METHOD AND APPARATUS FOR OCCLUSION CULLING IN GRAPHICS
SYSTEMS

BACKGROUND OF THE INVENTION

The invention, generally, relates to computer graphics and,
more particularly, to a new and improved method and
apparatus for rendering images of three-dimensional scenes
using z-buffering.

Rendering is the process of making a perspective image of a
scene from a stored geometric model. The rendered image is a
two-dimensional array of pixels, suitable for display.

The model is a description of the objects to be rendered in
the scene stored as graphics primitives, most typically as
mathematical descriptions of polygons together with other
information related to the properties of the polygons. Part of
the rendering process is the determination of occlusion,
whereby the objects and portions of objects occluded from
view by other objects in the scene are eliminated.

As the performance of polygon rendering systems advances,
the range of practical applications grows, fueling demand for
ever more powerful systems capable of rendering ever more
complex scenes. There is a compelling need for low-cost
high-performance systems capable of handling scenes with
high depth complexity, i.e., densely occluded scenes (for
example, a scene in which ten polygons overlap on the screen
at each pixel, on average).

There is presently an obstacle to achieving high performance
in processing densely occluded scenes. In typical computer
graphics systems, the model is stored on a host computer
which sends scene polygons to a hardware rasterizer which
renders them into the rasterizer's dedicated image memory.
When rendering densely occluded scenes with such systems,
the bandwidth of the rasterizer's image memory is often a
performance bottleneck.

Traffic between the rasterizer and its image memory
increases in approximate proportion to the depth complexity
of the scene. Consequently, frame rate decreases in
approximate proportion to depth complexity, resulting in poor
performance for densely occluded scenes.

A second potential bottleneck is the bandwidth of the bus
connecting the host and the rasterizer, since the description
of the scene may be very complex and needs to be sent on
this bus to the rasterizer every frame. Although memory and
bus bandwidth has been increasing steadily, processor speed
has been increasing faster than associated memory and bus
speeds.

Consequently, bandwidth limitations can become relatively
more acute over time. In the prior art, designers of hardware
rasterizers have addressed the bottleneck between the
rasterizer and its image memory in two basic ways:
increasing image-memory bandwidth through interleaving and
reducing bandwidth requirements by using smart memory.

Interleaving is commonly employed in high-performance
graphics workstations. For example, the SGI Reality Engine
achieves a pixel fill rate of roughly 80 megapixels per
second using 80 banks of memory.

An alternative approach to solving the bandwidth problem is
called the smart memory technique. One example of this
technique is the Pixel-Planes architecture. The memory
system in this architecture takes as input a polygon defined
by its edge equations and writes all of the pixels inside the
polygon, so the effective bandwidth is very high for large
polygons.

Another smart-memory approach is "FBRAM," a memory-chip
architecture with on-chip support for z-buffering and
compositing. With such a chip, the read-modify-write cycle
needed for z-buffering can be replaced with only writes, and
as a result, the effective drawing bandwidth is higher than
standard memory.

All of these methods improve performance, but they involve
additional expense, and they have other limitations.
Considering cost first, these methods are relatively
expensive, which precludes their use in low-end PC and
consumer systems that are very price sensitive.

A [...] three-dimensional rasterization system [...]
frame-buffer memory system, which in turn consists of a
single bank of memory. Such a system cannot be highly
interleaved because a full-screen image requires only a few
memory chips (one 16 megabyte memory chip can store a
1024 by 1024 by 16 bit image), and including additional
memory chips is too expensive.

Providing smart memory, such as FBRAM, is an option, but
the chips usually used here are produced in much lower
volumes than standard memory chips and are often
considerably more expensive. Even when the cost of this
option is justified, its performance can be inadequate when
processing very densely occluded scenes.

Moreover, neither interleaving nor smart memory addresses
the root cause of inefficiency in processing densely occluded
scenes, which is that most work is expended processing
occluded geometry. Conventional rasterization needs to
traverse every pixel on every polygon, even if a polygon is
entirely occluded.

Hence, there is a need to incorporate occlusion culling into
hardware renderers, by which is meant culling of occluded
geometry before rasterization, so that memory traffic during
rasterization is devoted to processing only visible and nearly
visible polygons. Interleaving, smart memory, and occlusion
culling all improve performance in processing densely
occluded scenes, and they can be used together or separately.

While occlusion culling is new to hardware for z-buffering,
it has been employed by software rendering algorithms. One
important class of such techniques consists of hierarchical
culling methods that operate in both object space and image
space. Hierarchical object-space culling methods include the
"hierarchical visibility" algorithm which organizes scene
polygons in an octree and traverses octree cubes in
near-to-far occlusion order, culling cubes if their front faces
are occluded. A similar strategy for object-space culling that
works for architectural scenes is to organize a scene as rooms
with "portals" (openings such as doors and windows), which
permits any room not containing the viewpoint to be culled if
its portals are occluded.

Both the hierarchical visibility method and the "rooms and
portals" method require determining whether a polygon is
visible without actually rendering it, an operation that will
be referred to as a visibility query or v-query. For example,
whether an octree cube is visible can be established by
performing v-query on its front faces.

The efficiency of these object-space culling methods depends
on the speed of v-query, so there is a need to provide fast
hardware support.

Hierarchical image-space culling methods include
hierarchical z-buffering and hierarchical polygon tiling with
coverage masks, both of which are loosely based on Warnock's
recursive subdivision algorithm.

With hierarchical z-buffering, z-buffer depth samples are
maintained in a z-pyramid having NxN decimation from
level to level (see N. Greene, M. Kass, and G. Miller,
"Hierarchical Z-Buffer Visibility," Proceedings of
SIGGRAPH '93, July 1993). The finest level of the z-pyramid
is an ordinary z-buffer. At the other levels of the pyramid,
each z-value is the farthest z in the corresponding NxN
region at the adjacent finer level. To maintain the z-pyramid,
whenever a z-value in the finest level is changed, that value
is propagated through the coarser levels of the pyramid.
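
By way of illustration only, the maintenance rule described above can be sketched in software as follows. The sketch assumes a square z-buffer whose side is a power of N, represents each level as a list of rows, and treats larger z as farther; the names build_z_pyramid and write_sample are hypothetical and do not come from the cited paper.

```python
def build_z_pyramid(zbuffer, n=4):
    """Return [finest, ..., coarsest]; each coarse cell holds the farthest
    (largest) z of its n x n block at the adjacent finer level.
    Assumes the z-buffer is square with a side that is a power of n."""
    levels = [zbuffer]
    while len(levels[-1]) > 1:
        fine = levels[-1]
        size = len(fine) // n
        levels.append([[max(fine[y * n + dy][x * n + dx]
                            for dy in range(n) for dx in range(n))
                        for x in range(size)]
                       for y in range(size)])
    return levels


def write_sample(levels, x, y, z, n=4):
    """Store z at a finest-level sample, then propagate the farthest z
    through the coarser levels, stopping once a level is unchanged."""
    levels[0][y][x] = z
    for level in range(1, len(levels)):
        x, y = x // n, y // n
        fine = levels[level - 1]
        farthest = max(fine[y * n + dy][x * n + dx]
                       for dy in range(n) for dx in range(n))
        if farthest == levels[level][y][x]:
            break  # coarser levels cannot change either
        levels[level][y][x] = farthest
```

Stopping the propagation as soon as a coarser value is unchanged reflects the fact that a farthest-z entry only changes when the sample that was written had been the farthest one in its block.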

Since each entry in the pyramid represents the farthest 
visible z within a square region of the screen, a polygon is
occluded within a pyramid cell if its nearest point within the 
cell is behind the corresponding z-pyramid value. Thus, 
often a polygon can be shown to be occluded by mapping it 
to the smallest enclosing z-pyramid cell and making a single 
depth comparison.
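
As a minimal sketch of this single-comparison test, assume the polygon is summarized by an inclusive screen-space bounding box in finest-level sample coordinates together with its nearest depth, that levels[0] is the finest pyramid level, and that larger z is farther. The function names below are hypothetical.

```python
def smallest_enclosing_cell(x0, y0, x1, y1, num_levels, n=4):
    """Return (level, cx, cy) for the finest pyramid cell that contains
    the inclusive sample-space box, or None if no single cell does."""
    for level in range(num_levels):
        scale = n ** level  # cell width in finest-level samples
        if x0 // scale == x1 // scale and y0 // scale == y1 // scale:
            return level, x0 // scale, y0 // scale
    return None


def quick_occlusion_test(levels, bbox, nearest_z, n=4):
    """True only when one depth comparison proves the polygon occluded.
    False means 'not proven', so the test is conservative."""
    cell = smallest_enclosing_cell(*bbox, len(levels), n)
    if cell is None:
        return False
    level, cx, cy = cell
    # Occluded if the polygon's nearest point is behind the farthest
    # visible depth recorded for the cell.
    return nearest_z > levels[level][cy][cx]
```

A False result does not mean the polygon is visible, only that this one comparison could not prove occlusion.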

When this test fails to cull a polygon, visibility can be 
established definitively by subdividing the enclosing image 
cell into an NxN grid of subcells and by comparing polygon 
depth to z-pyramid depth within the subcells. 

Recursive subdivision continues in subcells where the 
polygon is potentially visible, ultimately finding the visible 
image samples on a polygon or proving that the polygon is 
occluded. Since this culling procedure only traverses image 
cells where a polygon is potentially visible, it can greatly
reduce computation and z-buffer memory traffic, compared 
to conventional rasterization, which needs to traverse every 
image sample on a polygon, even if the polygon is entirely 
occluded. 
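
The recursive procedure can be sketched as follows, purely as an illustration. To keep the example short, the "polygon" is assumed to be an axis-aligned screen rectangle at a single constant depth, which is a simplification not made in the text; levels[0] is the finest level, larger z is farther, and visible_samples is a hypothetical name.

```python
def visible_samples(levels, rect, z, n=4):
    """Return the finest-level samples at which a constant-depth
    rectangle (x0, y0, x1, y1, inclusive) is visible, visiting only
    cells where it is potentially visible."""
    x0, y0, x1, y1 = rect
    found = []

    def visit(level, cx, cy):
        scale = n ** level
        # Skip cells that the rectangle does not overlap.
        if (cx * scale > x1 or (cx + 1) * scale - 1 < x0 or
                cy * scale > y1 or (cy + 1) * scale - 1 < y0):
            return
        # Cull: the rectangle is not nearer than the farthest depth
        # stored for this cell, so no sample inside can be visible.
        if z >= levels[level][cy][cx]:
            return
        if level == 0:
            found.append((cx, cy))
            return
        for dy in range(n):
            for dx in range(n):
                visit(level - 1, cx * n + dx, cy * n + dy)

    top = len(levels) - 1
    for cy in range(len(levels[top])):
        for cx in range(len(levels[top][cy])):
            visit(top, cx, cy)
    return found
```

When the rectangle lies behind everything already drawn, the traversal ends at the coarsest level after a single comparison per coarsest cell.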

Hierarchical z-buffering accelerates v-query as well as
culling of occluded polygons. 

Another algorithm that performs image-space culling
with hierarchical depth comparisons is described by Latham 
in U.S. Pat. No. 5,509,110, "Method for tree-structured 
hierarchical occlusion in image generators," April 1996.
Although Latham's algorithm does not employ a full-screen 
z-pyramid, it does maintain a depth hierarchy within rect- 
angular regions of the screen which is maintained by propa- 
gation of depth values. 

As an alternative to hierarchical z-buffering with a complete
z-pyramid, a graphics accelerator could use a two-level
depth hierarchy. Systems used for flight-simulation graphics 
can maintain a "zfar" value for each region of the screen. 

The screen regions are called spans and are typically 2x8 
pixels. Having spans enables "skip over" of regions where
a primitive is occluded over an entire span. 
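
A two-level hierarchy of this kind might be sketched as below. The example assumes spans that are 8 pixels wide and 2 pixels high (the text gives only "2x8" without stating the orientation), a z-buffer whose dimensions are multiples of the span size, and larger z meaning farther. Both function names are hypothetical.

```python
def span_zfar(zbuffer, span_w=8, span_h=2):
    """Compute one zfar value (farthest depth) per span of the z-buffer."""
    rows, cols = len(zbuffer), len(zbuffer[0])
    return [[max(zbuffer[y][x]
                 for y in range(sy, sy + span_h)
                 for x in range(sx, sx + span_w))
             for sx in range(0, cols, span_w)]
            for sy in range(0, rows, span_h)]


def spans_to_rasterize(zfar, prim_nearest_z):
    """Return (span_x, span_y) indices that cannot be skipped: spans
    where the primitive's nearest depth is in front of the span's zfar."""
    return [(sx, sy)
            for sy, row in enumerate(zfar)
            for sx, far in enumerate(row)
            if prim_nearest_z < far]
```

Spans missing from the returned list are the ones that can be skipped over without reading individual depth samples.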

Another rendering algorithm which performs hierarchical 
culling in image space is hierarchical polygon tiling with 
coverage masks. If scene polygons are traversed in near-to-far
occlusion order, resolving visibility only requires storing
a coverage bit at each raster sample rather than a depth 
value, and with hierarchical polygon tiling, this coverage 
information is maintained hierarchically in a coverage pyramid
having NxN decimation from level to level.

Tiling is performed by recursive subdivision of image 
space, and since polygons are processed in near-to-far 
occlusion order, the basic tiling and visibility operations 
performed during subdivision can be performed efficiently 
with NxN coverage masks. This hierarchical tiling method
can be modified to perform hierarchical z-buffering by 
maintaining a z-pyramid rather than a coverage pyramid and 
performing depth comparisons during the recursive subdi- 
vision procedure. 
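
The basic coverage-mask operation within a single tile can be illustrated with bitwise arithmetic. The sketch below assumes a 4x4 tile packed into a 16-bit integer in row-major order with bit 0 at sample (0, 0); that packing and the function names are assumptions made for the example, not details given in the text.

```python
def mask_from_samples(samples, n=4):
    """Pack a collection of (x, y) tile samples into an n*n-bit mask."""
    mask = 0
    for x, y in samples:
        mask |= 1 << (y * n + x)
    return mask


def tile_polygon(covered, polygon_mask, n=4):
    """Process one polygon in near-to-far order within a tile.
    Returns (new_covered, visible): the polygon is visible only at
    samples it covers that no nearer polygon has already covered."""
    full = (1 << (n * n)) - 1
    visible = polygon_mask & ~covered & full
    return covered | polygon_mask, visible
```

Once the covered mask equals the full mask, every later polygon is occluded within the tile, which is the condition a coverage pyramid would record in the parent cell.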

This modified version of hierarchical tiling with coverage
masks is believed to be the fastest algorithm available for 
hierarchical z-buffering of polygons. However, for today's 
processors, such software implementations of this algorithm
are not fast enough to render complex scenes in real time. 

A precursor to hierarchical polygon tiling with coverage 
masks is Meagher's method for rendering octrees, which 
renders the faces of octree cubes in near-to-far occlusion 
order using a similar hierarchical procedure. 

The ZZ-buffer algorithm is another hierarchical rendering 
algorithm. Although it does not perform z-buffering, it does 
maintain an image-space hierarchy of depth values to enable 
hierarchical occlusion culling during recursive subdivision 
of image space. 

Yet another approach to culling has been suggested, one 
that renders a z-buffer image in two passes and only needs 
to shade primitives that are visible. In the first pass, all 
primitives are z-buffered without shading to determine 
which primitives are visible, and in the second pass, visible 
primitives are z-buffered with shading to produce a standard
shaded image.
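
One possible reading of this two-pass scheme is sketched below. Each primitive is assumed to be supplied as a pre-rasterized list of (x, y, z) fragments with a flat color, and smaller z is nearer; the record layout and the name two_pass_render are hypothetical. The traversed counter records how many fragments are touched in total.

```python
def two_pass_render(primitives, width, height, far=1.0):
    """Pass 1 resolves depth without shading; pass 2 shades only the
    primitives that own at least one visible sample."""
    zbuf = [[far] * width for _ in range(height)]
    owner = [[None] * width for _ in range(height)]
    traversed = 0
    # Pass 1: z-buffer every fragment of every primitive, no shading.
    for pid, prim in enumerate(primitives):
        for x, y, z in prim["fragments"]:
            traversed += 1
            if z < zbuf[y][x]:
                zbuf[y][x] = z
                owner[y][x] = pid
    visible = {pid for row in owner for pid in row if pid is not None}
    # Pass 2: traverse visible primitives again, shading owned samples.
    image = [[None] * width for _ in range(height)]
    for pid in sorted(visible):
        for x, y, z in primitives[pid]["fragments"]:
            traversed += 1
            if owner[y][x] == pid:
                image[y][x] = primitives[pid]["color"]
    return image, visible, traversed
```

In this sketch the first pass touches every fragment of every primitive regardless of occlusion, so the savings are confined to shading work.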

Although this suggested approach reduces the amount of 
work that must be done on shading, it is not an effective 
culling algorithm for densely occluded scenes because every 
pixel inside every primitive must be traversed at least once. 
In fact, this approach does not fall within an acceptable 
definition for occlusion culling, since it relies on pixel-by- 
pixel rasterization to establish visibility. 

The object-space and image-space culling methods, 
described above, can alleviate bandwidth bottlenecks when 
rendering densely occluded scenes. Suppose that a host 
computer sends polygon records to a graphics accelerator 
which renders them with hierarchical z-buffering using its 
own z-pyramid. 

Suppose, further, that the accelerator can perform v-query 
and report the visibility status of polygons to the host. With 
hierarchical z-buffering, occluded polygons can be culled 
with a minimum of computation and memory traffic with the 
z-pyramid, and since most polygons in densely occluded 
scenes are occluded, the reduction in memory traffic 
between the accelerator and its image memory can be 
substantial. 

Hierarchical z-buffering also performs v-query tests on 
portals and bounding boxes with minimal computation and 
memory traffic, thereby supporting efficient object-space 
culling of occluded parts of the scene. While hierarchical 
z-buffering can improve performance, today's processors 
are not fast enough to enable software implementations of 
the traditional algorithm to render complex scenes in real 
time. 

Thus there is a need for an efficient hardware architecture 
for hierarchical z-buffering. 

OBJECT AND BRIEF SUMMARY OF THE 
INVENTION 

It is an object of the present invention to provide a new 
and improved graphics system for rendering computer 
images of three-dimensional scenes. 

Briefly, the preferred embodiment separates culling of 
occluded geometry from rendering of visible geometry. 
According to the invention, a separate culling stage receives 
geometry after it has been transformed, culls occluded 
geometry, and passes visible geometry on to a rendering 
stage. This reduces the amount of geometric and image 
information that must be processed when rendering densely 
occluded scenes, thereby reducing memory and bus traffic 
and improving performance. 

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1 is a block diagram of the preferred embodiment of 
the invention. 
FIG. 2 is an illustration of a z-pyramid organized in 4x4
tiles.

FIG. 3 is a flowchart of the method for rendering a list of
polygons.

FIG. 4 is an illustration showing the relationship of
bounding boxes to the view frustum in model space.

FIG. 5 is a flowchart of the method for rendering frames
with box culling.

FIG. 6 is a flowchart of the method for sorting bounding
boxes into layers.

FIG. 7 is a flowchart of the method for processing a batch
of boxes.

FIG. 8 is a flowchart of the method for tiling a list of
polygons.

FIG. 9 is a flowchart of the method for geometric
processing of a polygon.

FIG. 10 is an illustration of a 4x4 tile showing its
coordinate frame.

FIG. 11 is a flowchart of the method for tiling a convex
polygon.

FIG. 12 is a flowchart of the method for reading an array
of z-values.

FIG. 13 is a flowchart of the method for processing an
NxN tile.

FIG. 14 is an illustration of a 4x4 tile and a triangle.

FIG. 15 is an illustration of nested coordinate frames.

FIG. 16 is a flowchart of the method for updating array
zfar_x.

FIG. 17 is a flowchart of the method for propagating
z-values.

FIG. 18a is an illustration of a view frustum in model
space.

FIG. 18b is an illustration of the coarsest 4x4 tile in a
z-pyramid.

FIG. 19 is a flowchart of a method for determining
whether a bounding box is occluded by the "tip" of the
z-pyramid.

FIG. 20 is a block diagram of data flow within the culling
stage.

FIG. 21 is a side view of a 4x4 tile in the z-pyramid.

FIG. 22a is an illustration of a 4x4 tile covered by two
triangles.

FIG. 22b is an illustration of the coverage mask of triangle
Q in FIG. 22a.

FIG. 22c is an illustration of the coverage mask of triangle
R in FIG. 22a.

FIG. 23 is a side view of a 4x4 tile in the z-pyramid and
two triangles that cover it.

FIG. 24 is a schematic side view of a 4x4 tile in the
z-pyramid.

FIG. 25 is a flowchart of the method for updating a
mask-zfar tile record.

FIG. 26 is a side view of a cell in the z-pyramid which
is covered by three polygons.

FIG. 27 is an outline of the procedure for rendering
frames using frame coherence.

FIG. 28 is a flowchart of the method of determining
whether the plane of a polygon is occluded within a cell.

FIG. 29 is a flowchart of a "Create Look-Ahead Frame"
procedure.

DETAILED DESCRIPTION OF THE
INVENTION

One of the key features in the preferred embodiment is to
separate culling of occluded geometry from rendering of
visible geometry, so that culling operations are optimized
independently. According to this feature, a separate culling
stage in the graphics pipeline culls occluded geometry and
passes visible geometry on to a rendering stage.

The culling stage maintains its own z-pyramid in which
z-values are stored at low precision in order to reduce
storage requirements and memory traffic. For example,
z-values may be stored as 8-bit values instead of the
customary 24-bit or 32-bit values.

Alternatively, occlusion information can be stored in [...]
which require less storage than a [...].

A second, independent method for reducing storage
requirements and memory traffic is to use a low-resolution
z-pyramid where each z-value in the finest level is a
conservative z-far value for a group of samples.

The novel algorithm presented herein involving hierarchical
z-buffering is more efficient and more suitable for
hardware implementation than algorithms that have been
used previously. The algorithm performs z-buffer tiling
hierarchically on NxN regions of image space using a
z-pyramid having NxN decimation from level to level to
store the depths of previously rendered polygons.

At each cell encountered during hierarchical tiling of a
polygon, conservative culling is performed very efficiently
by comparing the z-pyramid value to the depth of the plane
of the polygon. This routine hierarchically evaluates the line
and plane equations describing a polygon using a novel
algorithm that does not require general-purpose
multiplication (except for set-up computations).

This evaluation method can also be applied to shading and
interpolation computations that require evaluation of
polynomial equations at samples within a spatial hierarchy.
The framework just described is particularly attractive for
hardware implementation because of its simplicity and
computational efficiency and the fact that image memory is
accessed in NxN tiles during the read-compare-write cycle
for depth values.

Definitions.

Culling procedures that may fail to cull occluded geometry
but never cull visible geometry are defined as conservative.

Z-buffering determines which scene primitive is visible at
each sample point on an image raster.

Each sample point on the image raster is defined as an
image sample, and the depth at an image sample is called a
depth sample.

A z-buffer maintains one depth sample for each point in
the image raster. If individual points in the image raster
correspond to individual pixels, it is referred to as point
sampling.

An alternative is to maintain multiple depth samples
within each pixel to permit antialiasing by oversampling and
filtering.

A cell in the z-pyramid is the region of the screen
corresponding to a value in the z-pyramid. Preferably, at the
finest level of the z-pyramid, cells correspond to depth
samples: depths at pixels when point sampling and depths at
subpixel samples when oversampling. At coarser levels of
the z-pyramid, cells correspond to square regions of the
screen, as with image pyramids in general.

NxN decimation from level to level of the z-pyramid is
used. NxN blocks of cells that are implicit in the structure
of the z-pyramid are identified as tiles or NxN tiles.

A z-pyramid will sometimes be referred to simply as a
pyramid. The term bounding box, sometimes shortened to
box, is applied to bounding volumes of any shape, including
the degenerate case of a single polygon (thus, the term
includes polygonal "portals" employed by some culling
methods).

Although the tiling algorithm described herein is adapted
for z-buffering of polygons, z-buffering can also be applied
to other types of geometric primitives, for example, quadric
surfaces.

The term primitive applies to all types of geometric
primitives including polygons.

As used herein, the term "object" (or "geometric object")
is more general than the term "primitive" (or "geometric
primitive"), since it may refer to a primitive, a bounding
box, a face of a bounding box, and so forth.

A primitive, bounding box, or other geometric object is
occluded if it is known to be occluded at all image samples
that it covers, it is visible if it is known to be visible at one
or more image samples, and otherwise, it is potentially
visible.

For convenience, in some cases, visible and potentially
visible objects are collectively referred to as visible.

FIG. 1 illustrates a preferred embodiment of the present
invention in which the numeral 100 identifies a graphics
system for rendering geometric models represented by
polygons. The graphics system includes a scene manager 110
which sends scene geometry to a geometric processor 120.

The geometric processor 120, in turn, transforms the
geometry to perspective space and sends it on to a culling
stage 130, which culls occluded geometry and passes visible
polygons to a z-buffer rendering stage 140 which generates
the output image 150 which is converted to video format in
a video output stage 160.

Both the culling stage 130 and the z-buffer renderer 140
have their own dedicated depth buffers, a z-pyramid 170 in
the case of the culling stage 130 and a conventional z-buffer
180 in the case of the z-buffer renderer 140. Preferably, the
z-buffer 180 and the finest level of the z-pyramid 170 have
the same resolution and the same arrangement of image
samples.

A "feedback connection" 190 enables the culling stage
130 to report the visibility status of bounding boxes to the
scene manager 110 and, also, to send z-pyramid z-values to
the scene manager 110.

The culling stage 130 is optimized for very high-performance
culling by performing hierarchical z-buffering
using a dedicated z-pyramid 170 in which z-values are
stored at low precision (for example, 8 bits per z-value) in
order to conserve storage and memory bandwidth.

In addition to storing z-values at low precision, the culling
stage 130 may also compute z-values at low precision to
accelerate computation and simplify computational logic.

Since z-values in the z-pyramid 170 are stored at low
precision, each value represents a small range of depths.
Therefore, visibility at image samples is not always
established definitively by the culling stage 130.

However, computations within the culling stage 130 are
structured so that culling is conservative, meaning that some
occluded geometry can fail to be culled but visible geometry
is never culled. Visibility at image samples is established
definitively by the z-buffer renderer 140, since z-values
within its z-buffer 180 are stored at full precision (e.g. 32
bits per z-value).

Because of the difference in depth-buffer precision
between the z-buffer 180 and the z-pyramid 170, some
potentially visible polygons sent from the culling stage 130
on to the z-buffer renderer 140 may not contribute visible
samples to the output image 150.

The total amount of storage required by the low-precision
z-pyramid in the culling stage is less than the total amount
of storage required by the z-buffer in the rendering stage. For
example, if each z-value in a z-pyramid having 4x4
decimation is stored in 8 bits and each z-value in a z-buffer
having the same resolution is stored in 32 bits, the number
of bits in each z-value in the z-buffer is four times the
number of bits in each z-value in the z-pyramid, and the total
bits of storage in the z-buffer is approximately 3.75 times the
total bits of storage in the z-pyramid.

If instead, each z-value in the z-pyramid is stored in 4 bits,
the number of bits in each z-value in the z-buffer is eight
times the number of bits in each z-value in the z-pyramid,
and the total bits of storage in the z-buffer is approximately
7.5 times the total bits of storage in the z-pyramid.

Within the culling stage 130, hierarchical z-buffering is
performed using a hierarchical tiling algorithm which
includes a hierarchical method for evaluating the linear
equations describing polygons according to the invention.

The advantage of this hierarchical evaluation method is
that it does not require general-purpose multiplication,
enabling implementation with faster and more compact
logic. These aspects of the invention will be described in
more detail hereinafter.

To facilitate reading and writing in blocks, the z-pyramid
is organized preferably in NxN tiles, as illustrated in FIG. 2
for a three-level pyramid 200 organized in 4x4 tiles. Each
tile is a 4x4 array of "cells," which are samples 202 at the
finest level of the pyramid and square regions of the screen
206 at the other levels.

4x4 tiles are preferred over other alternatives, such as 2x2
or 8x8 tiles, because with 16 z-values, 4x4 tiles are large
enough for efficient memory access and small enough that
the utilization of fetched values is reasonably high.

Within the z-pyramid, tiles are "nested": an NxN tile at
the finest level corresponds to a cell inside its "parent tile"
at the next-to-finest level, this parent tile corresponds to a
cell inside a "grandparent tile" at the adjacent coarser level,
and so forth for all "ancestors" of a given tile.

For example, 4x4 tile 220 corresponds to cell 218 inside
parent tile 210, and tile 210 corresponds to cell 208 inside
grandparent tile 216. In this example, tile 220 "corresponds
to" cell 218 in the sense that tile 220 and cell 218 cover the
same square region of the screen.

In FIG. 2, the image raster is a 64x64 array of depth
samples 202 arranged in a uniform grid, only part of which
is shown to conserve space.

When point sampling, these depth samples correspond to
a 64x64 array of pixels. Alternatively, when oversampling
with a 4x4 array of depth samples within each pixel, this
image raster corresponds to a 16x16 array of pixels. Of
course, z-pyramids normally have much higher resolution
than illustrated in this example.

Herein, as applied to a z-pyramid, the term resolution
means the resolution of the z-pyramid's finest level.

The z-value associated with each cell of a z-pyramid is the
farthest depth sample in the corresponding region of the
screen. For example, in FIG. 2 the z-value associated with
cell 208 is the farthest of the 16 corresponding z-values in
tile 210 in the adjacent finer level and, also, is the farthest of
the 256 depth samples in the corresponding region of the
finest level 212 (this region is a 4x4 array of 4x4 tiles).

Thus, the finest level of the z-pyramid 200 is a z-buffer
containing the depth of the nearest primitive encountered so
far at each image sample, and the other levels contain z-far
values, indicating the depths of the farthest depth samples in
the z-buffer within the corresponding square regions of the
screen.
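
The storage figures quoted above (approximately 3.75 and 7.5) and the nesting of tiles can be checked with a short calculation. The sketch below assumes a square raster whose side is a power of N and a coarsest level consisting of a single NxN tile, as in the three-level 64x64 example of FIG. 2; the function names, and the convention of addressing tiles by integer (x, y) indices within a level, are hypothetical.

```python
def pyramid_level_sizes(resolution, n=4):
    """Side length, in cells, of each level from finest to coarsest,
    stopping at a single n x n tile."""
    sizes = []
    side = resolution
    while side >= n:
        sizes.append(side)
        side //= n
    return sizes


def storage_ratio(resolution, zbuffer_bits=32, pyramid_bits=8, n=4):
    """Total z-buffer bits divided by total z-pyramid bits."""
    cells = sum(s * s for s in pyramid_level_sizes(resolution, n))
    return (zbuffer_bits * resolution * resolution) / (pyramid_bits * cells)


def parent_cell(tile_x, tile_y, n=4):
    """Map a tile to (parent tile index, cell index inside that parent)
    at the adjacent coarser level."""
    return (tile_x // n, tile_y // n), (tile_x % n, tile_y % n)
```

For the 64x64 raster the levels hold 4096, 256, and 16 cells, so the pyramid stores about 16/15 as many values as the z-buffer; dividing the 4:1 or 8:1 per-value ratio by that factor gives the approximate 3.75 and 7.5 totals.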
Since a z-pyramid has a plurality of levels which are each
a depth buffer, it can also be described as a hierarchical depth
buffer.

Although the z-pyramid of FIG. 2 is organized in NxN
tiles, in general, z-pyramid tiles are not necessarily square
and need not have the same number of rows and columns.
The illustrated structure of nested squares can be modified to
accommodate non-square images of arbitrary resolution by
storing values for only cells within a rectangular region of
each pyramid level. In FIG. 2 of the drawings, image
samples are arranged on a regular grid. Alternatively,
samples can be "jittered" to reduce aliasing.

The Scene Manager.

The scene manager 110 is implemented in software running
on a host processor. It reads the scene model from
memory, maintains geometric data structures for the scene
model, and initiates the flow of geometry through the
graphics system 100. It also initiates commands, such as
those that initialize the output image and depth buffers prior
to rendering a frame (all values in the z-buffer 180 and
z-pyramid 170 are initialized to the depth of the far clipping
plane).

The system is structured to operate with or without "box
culling" (culling of parts of the scene that are inside
occluded bounding boxes). Preferably, densely occluded
scenes are rendered with box culling, since this accelerates
frame generation.

Rendering a Scene without Box Culling.

In this mode of operation, the scene manager 110 can send
all polygons in the scene through the system in a single
stream. Each polygon in the stream is transformed to
perspective space by the geometric processor 120, tiled into the
z-pyramid 170 by the culling stage 130 and, if not culled by
the culling stage 130, z-buffered into the output image 150
by the z-buffer renderer 140. This sequence of operations is
summarized in procedure Render Polygon List 300, shown
in the flowchart of FIG. 3. According to the procedure 300,
the geometric processor 120 receives records for polygons
from the scene manager 110 and processes them using
procedure Transform & Set Up Polygon 900 (step 302),
which transforms each polygon to perspective space and
performs "set-up" computations.

Transform & Set Up Polygon 900 also creates two records
for each polygon, a tiling record containing geometric
information that the culling stage 130 needs to perform
hierarchical tiling, and a rendering record containing the
information needed by the z-buffer renderer 140 to render
the polygon. The geometric processor 120 outputs these
records to the culling stage 130.

In step 304 of Render Polygon List 300, the culling stage
130 processes these records using procedure Tile Polygon
List 800, which tiles each polygon into the z-pyramid 170
and determines whether it is visible. For each visible
polygon, the culling stage 130 sends the corresponding
rendering record on to the z-buffer renderer 140, which [...]

Processing the boxes in a scene in near-to-far order
maximizes culling efficiency and minimizes computation
and memory traffic. One way to facilitate near-to-far
traversal is to organize polygons into a spatial hierarchy such
as an octree. However, building and maintaining a spatial
hierarchy complicates the software interface and requires
additional storage.

Another way to achieve favorable traversal order is to sort
boxes into strict near-to-far order at the beginning of a
frame. However, this method requires considerable
computation when there are numerous boxes. The preferred
embodiment employs a unique ordering system that quickly
sorts the boxes into approximate near-to-far order.

The unique ordering system of the invention is illustrated
in FIG. 4, which shows the bounding box 400 of all scene
geometry within the model-space coordinate frame 402, the
view frustum 404 (which is oriented so that four of its faces
are perpendicular to the page, for ease of illustration), six
bounding boxes labeled A-F, and nine "layers" L0, L1, ... ,
L8 defined by planes 406 that are parallel to the far clipping
plane 408.

The planes 406 appear as lines in the illustration because
they are perpendicular to the page. The planes 406 pass
through equally spaced points (e.g. 410, 412) on the line 414
that is perpendicular to the far clipping plane 408 and passes
through the corner of model space 416 that is farthest in the
"near direction," where the near direction is the direction of
the outward-pointing normal 418 to the "near" face 426 of
the view frustum 404. The plane through the nearest corner
416 of model space is called Pnear 424, where the "nearest
corner" of a box is the corner which lies farthest in the near
direction.

Procedure Render Frames with Box Culling 500,
illustrated in FIG. 5 of the drawings, is used to render a sequence
of frames with box culling. In step 502, scene polygons are
organized into bounding boxes, each containing some
manageable number of polygons (e.g., between 50 and 100).

The record for each box includes a polygon list, which
may be a list of pointers to polygons rather than polygon
records. If a particular polygon does not fit conveniently in
a single box, the polygon's pointer can be stored with more
than one box. Alternatively, the polygon can be clipped to
the bounds of each of the boxes that it intersects.

Next, step 504 begins the processing of a frame by
clearing the output image 150, the z-pyramid 170, and the
z-buffer 180 (z-values are initialized to the depth of the far
clipping plane).

Next, at step 505, viewing parameters for the next frame
to be rendered are obtained.

Then, procedure Sort Boxes into Layers 600 organizes the
bounding boxes into "layers," the record for each layer
including the boxes whose "nearest corner" lies within that
layer. Sort Boxes into Layers 600 also makes a list of boxes
that intersect the near face of the view frustum. Boxes on this
"near-box list" are known to be visible.

Next, step 506 loops over all boxes on the near-box list
renders the polygon into the output image 150 using con- and renders the polygon list of each box with Render 
ventional z-buffering (step 306). When all polygons have Polygon List 300. Next, step 508 processes layers in near- 
been processed, the output image is complete. to-far order, processing the boxes on each layer's list as a 
Procedures Transform & Set Up Polygon 900 and Tile "batch" with Process Batch of Boxes 700, which tests boxes 
Polygon List 800 will be described in more detail later, 60 for visibility and renders the polygons in visible boxes. 
Rendering a Scene with Box Culling. The advantage of processing boxes in batches rather than 
To render a scene with box culling, the scene is organized one at a time is that visibility tests on boxes take time, and 
in bounding boxes having polygonal faces. Before process- the more boxes that are tested at a time, the less the latency 
ing the geometry inside a box, the box is tested for per box. Actually, it is not necessary to process each layer as 
occlusion, and if it is occluded, the geometry contained in 65 a single batch, but when organizing boxes into batches, layer 
the box is culled. Box culling can accelerate rendering a lists should be utilized to achieve approximate near-to-far 
great deal. traversal. 
When all boxes have been processed, the output image is
complete so the image is displayed at step 510 and control
returns to step 504 to process the next frame.

Procedure Sort Boxes into Layers 600, illustrated in FIG.
6 of the drawings, maintains a list of boxes for each layer.
First, step 602 clears the near-box list and the list for each
layer to the null list. While boxes remain to be processed
(step 604), step 606 determines the bounds of polygons
within the box in the current frame.

Actually, this is only necessary when the box contains
moving polygons, since the bounds of boxes containing
only static polygons can be computed before processing the
first frame.

Next, step 608 determines whether the box lies outside the
view frustum. One fast way to show that a box lies outside
the view frustum is to show that it lies entirely outside a face
of the frustum. This can be done by substituting one corner
of the box into the face's plane equation.

In FIG. 4, for example, the fact that box F's "nearest
corner" 422 lies outside the frustum's "far" face 408
establishes that the box lies outside the frustum. The nearest
corners of the boxes are marked with a dot in FIG. 4.

If the box is determined to lie outside the frustum at step
608, control returns to step 604. Otherwise, step 610
determines whether the box intersects the "near" face of the view
frustum. If so, the box is added to the near-box list at step
612 and control returns to step 604.

If the box does not intersect the near face of the view
frustum, control proceeds to step 614. Step 614 determines
the index L of the layer containing the box's nearest corner
C using the following formula: L=floor(K*d/dfar), where K
is the number of layers, d is the distance from point C to
plane Pnear 424, dfar is the distance from plane Pnear 424
to the far clipping plane 408, and floor rounds a number to
the nearest smaller integer.

For example, in FIG. 4 z-far is labeled, as is depth d for
the nearest corner 420 of box E. In this case, the above
formula would compute a value of 5 for L, corresponding to
layer L5.

Next, in step 616, the box is added to the list for layer L
and control returns to step 604. When step 604 determines
that all boxes have been processed, the procedure terminates
at step 618.

In FIG. 4, Sort Boxes into Layers 600 places boxes A and
B in the near-box list, places box C into the list for layer L4,
places boxes D and E into the list for layer L5, and culls box
F.

In practice, complex scenes contain numerous boxes and
layers typically contain many more boxes than in this
example, particularly toward the back of the frustum, which
is wider. Also, many more layers should be used than shown
in this example to improve the accuracy of depth sorting.

Although the boxes in this example are rectangular solids,
a box can be defined by any collection of convex polygons.

In summary, procedure Render Frames with Box Culling
500 is an efficient way to achieve approximately near-to-far
traversal of boxes without sorting boxes into strict occlusion
order or maintaining a spatial hierarchy.
Processing a Batch of Boxes.

At step 508 of Render Frames with Box Culling 500, the
scene manager 110 organizes boxes into batches and calls
procedure Process Batch of Boxes 700 (FIG. 7) to process
each batch. Within Process Batch of Boxes 700, the scene
manager 110 culls boxes which are occluded by the "tip" of
the z-pyramid and sends the remaining boxes to the
geometric processor 120, which transforms the boxes and sends
them to the culling stage 130, which determines the visibility
of each box and reports its status to the scene manager 110
on the feedback connection 190. When this visibility
information is sent, the "tip" of the z-pyramid is also sent to the
scene manager 110 on the feedback connection 190.

Then, for each visible box, the scene manager 110 sends
the box's list of polygons out to be rendered, and if boxes are
nested, processes the "child" boxes that are inside each
visible box using this same procedure. This cycle of
operations, which alternates between processing in v-query
mode when testing boxes for visibility and processing in
rendering mode when rendering scene polygons, continues
until the whole scene has been rendered.

Considering now the steps of procedure Process Batch of
Boxes 700 (FIG. 7), in step 702, the scene manager 110 tests
each box in the batch to see if it is occluded by the tip of the
z-pyramid using procedure Is Box Occluded by Tip 1900,
which will be discussed later. Occluded boxes are removed
from the batch. Next, the scene manager 110 sends records
for the front faces of each box in the batch to the geometric
processor 120.

Using procedure Transform & Set Up Polygon 900, the
geometric processor 120 transforms each face to perspective
space and performs the other geometric computations
required to create the tiling record for the face, which is then
output to the culling stage 130 (step 704). While boxes
remain to be processed (step 706), the visibility of each box
is established by the culling stage 130, which determines
whether its front faces contain at least one visible sample
using procedure Tile Polygon List 800 operating in v-query
mode (step 708).

If step 708 establishes that the box is visible, the
corresponding "v-query status bit" is set to visible in step 710;
otherwise, it is set to occluded in step 712. As indicated by
step 706, this sequence of steps for processing boxes
continues until all boxes in the batch have been processed.

Then, step 714 sends the v-query status bits for the batch
of boxes from the culling stage 130 to the scene manager 110
on the feedback connection 190. Next, step 716 copies the
tip of the z-pyramid to the scene manager 110 on the
feedback connection 190. The "tip" includes the farthest
z-value in the pyramid, the coarsest NxN tile in the pyramid,
and perhaps some additional levels of the pyramid (but not
the entire pyramid, since this would involve too much
work).

If the farthest z-value in the z-pyramid is nearer than the
depth of the far clipping plane maintained by the scene
manager 110, step 716 resets the far clipping plane to this
farthest z-value. Copying the tip of the pyramid enables the
scene manager 110 to cull occluded boxes at step 702, as will
be described later.

Next, the scene manager 110 checks the v-query status of
each box in the batch and initiates processing of the
geometry inside each visible box (step 718). In step 720, the list
of polygons associated with a visible box is rendered with
procedure Render Polygon List 300.

According to procedure Render Frames with Box Culling
500, bounding boxes are not nested, but nested bounding
boxes can be handled with recursive calls to Process Batch
of Boxes 700, as indicated by optional steps 722 and 724. If
there are "child" boxes inside the current box (step 722), in
step 724 the scene manager 110 organizes these boxes into
one or more batches and processes each batch with this same
procedure 700.

Preferably, batches are processed in near-to-far order,
since this improves culling efficiency. When all child boxes
have been processed (or if there are no child boxes), control
returns to step 718, and when all visible boxes have been
processed the procedure 700 terminates at step 726.
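
Returning to Sort Boxes into Layers 600, the layer assignment of steps 604 through 618 can be sketched as follows. This is only an illustrative sketch: the `Box` record, its precomputed flags, the distances, and the clamp to the last layer are assumptions made for the example and are not part of the described apparatus; only the formula L=floor(K*d/dfar) comes from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical box record; the flags stand in for steps 608 and 610."""
    name: str
    outside_frustum: bool
    intersects_near_face: bool
    d: float  # distance from the box's nearest corner to plane Pnear


def layer_index(d, d_far, num_layers):
    """L = floor(K * d / dfar); clamping to the last layer is an assumption."""
    return min(math.floor(num_layers * d / d_far), num_layers - 1)


def sort_boxes_into_layers(boxes, d_far, num_layers):
    near_list = []                              # step 602: clear all lists
    layers = [[] for _ in range(num_layers)]
    for box in boxes:                           # step 604: loop over boxes
        if box.outside_frustum:                 # step 608: cull the box
            continue
        if box.intersects_near_face:            # steps 610 and 612
            near_list.append(box.name)
            continue
        layer = layer_index(box.d, d_far, num_layers)   # step 614
        layers[layer].append(box.name)                  # step 616
    return near_list, layers


# Made-up distances chosen to reproduce the outcome described for FIG. 4.
boxes = [
    Box("A", False, True, 0.0),
    Box("B", False, True, 0.0),
    Box("C", False, False, 4.5),
    Box("D", False, False, 5.2),
    Box("E", False, False, 5.9),
    Box("F", True, False, 0.0),
]
near_list, layers = sort_boxes_into_layers(boxes, d_far=9.0, num_layers=9)
```

With these assumed inputs, boxes A and B land on the near-box list, C in layer L4, D and E in layer L5, and F is culled, matching the FIG. 4 example.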
Culling with the Z-Pyramid.

Tile Polygon List 800 (FIG. 8) is the procedure used by
the culling stage 130 to tile a list of polygons. The procedure
800 receives as input from the geometric processor 120 the
processing mode, either v-query or render, and a list of
records for polygons.

When in render mode the geometric processor 120
outputs a tiling record for each polygon (geometric information
that the culling stage 130 needs to perform hierarchical
tiling) and a rendering record for each polygon (information
needed by the z-buffer renderer 140 to render the polygon).
When in v-query mode, the geometric processor 120 only
outputs a tiling record for each polygon.

Tile Polygon List 800 operates in render mode at step 304
of procedure Render Polygon List 300, and it operates in
v-query mode at step 708 of procedure Process Batch of
Boxes 700.

While polygons remain to be processed (step 802), Tile
Polygon List 800 passes the processing mode, a tiling record
and, if in render mode, a rendering record to Tile Convex
Polygon 1100, the hierarchical tiling procedure employed by
the culling stage 130. When in v-query mode, this procedure
1100 just determines whether the polygon is visible with
respect to the z-pyramid 170.

When in render mode, the procedure 1100 updates the
z-pyramid 170 when visible samples are encountered, and if
the polygon is visible, outputs its rendering record to the
z-buffer renderer 140. At step 804, if in v-query mode and
the polygon is visible, step 806 reports that the polygon list
is visible and the procedure terminates at step 808.

Otherwise, the procedure returns to step 802 to process
the next polygon. If the procedure 800 is still active after the
last polygon in the list has been processed and the mode is
v-query at step 810, step 812 reports that the polygon list is
occluded and then the procedure terminates at step 814.

Instead, if in render mode at step 810, the procedure
terminates immediately at step 814.
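
The control flow of Tile Polygon List 800 can be sketched roughly as below. This is a minimal sketch under stated assumptions: the mode strings, the `(tiling, rendering)` record pairs, and the `tile_convex_polygon` callback (a stand-in for procedure 1100 that returns whether the polygon is visible) are hypothetical names introduced for illustration.

```python
def tile_polygon_list(mode, records, tile_convex_polygon):
    """Sketch of steps 802-814.

    records: iterable of (tiling_record, rendering_record) pairs; the
    rendering record is ignored in v-query mode.
    tile_convex_polygon: hypothetical stand-in for procedure 1100,
    called as f(mode, tiling, rendering) and returning True if visible.
    """
    for tiling, rendering in records:                    # step 802
        visible = tile_convex_polygon(
            mode, tiling, rendering if mode == "render" else None
        )
        if mode == "v-query" and visible:                # step 804
            return "visible"                             # steps 806, 808
    if mode == "v-query":                                # step 810
        return "occluded"                                # steps 812, 814
    return None                                          # render mode: step 814


def stub(mode, tiling, rendering):
    """Toy visibility oracle used only for this illustration."""
    return tiling["visible"]


faces = [({"visible": False}, None), ({"visible": True}, None)]
status_all = tile_polygon_list("v-query", faces, stub)
status_first = tile_polygon_list("v-query", faces[:1], stub)
status_render = tile_polygon_list("render", faces, stub)
```

In v-query mode the loop stops at the first visible polygon, which is why a box is reported visible as soon as one of its front faces contains a visible sample.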
Tiling Records. 

Geometric computations are performed on polygons by
the geometric processor 120 using procedure Transform &
Set Up Polygon 900 (FIG. 9). This procedure 900 is
employed in step 302 of procedure Render Polygon List 300
and also in step 704 of Process Batch of Boxes 700.

For each polygon, Transform & Set Up Polygon 900
receives input from the scene manager 110 in the form of a
record for the polygon before it has been transformed to
perspective space, and for each polygon received, the
procedure 900 outputs a tiling record, and when in render mode,
it also outputs a rendering record.

First, step 902 transforms the polygon's vertices to
perspective space. Next, step 904 determines the smallest NxN
tile in the pyramid that encloses the transformed polygon.

For example, in FIG. 2 tile 210 is the smallest enclosing
4x4 tile for triangle 214. (Triangle 214 is also enclosed by
4x4 tile 216, but this tile is considered "larger" than tile 210
because it is larger in screen area: it covers the whole
screen, whereas tile 210 covers one-sixteenth of the screen.)

Next, step 906 establishes the corner of the screen where
the plane of the polygon is nearest to the viewer (i.e., farthest
in the "near" direction). The method for computing this
"nearest corner" will be described later, in connection with
step 1308 of procedure 1300.
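
The quadrant rule that the text later gives in connection with step 1308 (the nearest corner follows the quadrant of (nx,ny) for a backward-pointing normal, or of (-nx,-ny) for a forward-pointing one) can be sketched as below, together with the single znear comparison. The function names, the string labels for corners, the treatment of a zero component as positive, and the convention that a larger z is farther are assumptions made for this illustration.

```python
def nearest_corner(nx, ny, backward_pointing=True):
    """Return which corner of a cell (or of the screen) is nearest to the viewer.

    Uses the quadrant of (nx, ny) for a backward-pointing normal and of
    (-nx, -ny) otherwise. Zero components are treated as positive here,
    which is an arbitrary choice for the sketch.
    """
    if not backward_pointing:
        nx, ny = -nx, -ny
    horizontal = "right" if nx >= 0 else "left"
    vertical = "upper" if ny >= 0 else "lower"
    return f"{vertical} {horizontal}"


def plane_occluded_in_cell(plane, corner_xy, cell_zfar):
    """Single occlusion test for a cell covering a square screen region.

    plane: coefficients (A, B, C) of z = A*x + B*y + C in the tile's frame.
    corner_xy: coordinates of the cell's nearest corner (even integers).
    Assumes larger z means farther from the viewer.
    """
    a, b, c = plane
    x, y = corner_xy
    znear = a * x + b * y + c
    return znear > cell_zfar


corner = nearest_corner(-1.0, -2.0)                         # both negative
hidden = plane_occluded_in_cell((0.5, 0.0, 1.0), (2, 4), 1.5)
maybe_visible = plane_occluded_in_cell((0.5, 0.0, 1.0), (0, 0), 1.5)
```

A result of `False` only means the plane is potentially visible within the cell; as the text explains, visibility is then resolved by subdivision rather than by further tests.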

Next, step 908 computes the equation of the plane of the
polygon and the equation of each edge of the polygon. The
coefficients in these equations are relative to the smallest
enclosing NxN tile.

Next, step 910 creates a tiling record for the polygon from
the geometric information computed in the preceding steps
and outputs this record to the culling stage 130. If in render
mode, step 910 also creates a rendering record for the
polygon which contains the information needed by the
z-buffer renderer 140 to render the polygon, and outputs this
record to the culling stage 130. Following step 910, the
procedure terminates at step 912.

Geometric information computed for a polygon by Trans- 
form & Set Up Polygon 900 is stored in a tiling record 5000 
containing the following information. 
Tiling Record. 

1. level number and index of smallest enclosing tile 
("level," "index"); 

2. screen corner where plane of polygon is nearest 
("nearest_corner"); 

3. number of edges ("n"); 

4. coefficients (A_1,B_1,C_1), (A_2,B_2,C_2), ..., (A_n,B_n,C_n) of
edge equations (polygon has n edges); and

5. coefficients (A_p,B_p,C_p) of plane equation.
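
For readers who prefer a data-structure view, the five fields listed above might be held in a record like the following. This is a hypothetical layout for illustration only: the class name, field types, and the choice to derive n from the edge list are assumptions, not the record format of the described hardware.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coeffs = Tuple[float, float, float]  # one (A, B, C) coefficient triple


@dataclass
class TilingRecord:
    level: int             # item 1: level number of smallest enclosing tile
    index: int             # item 1: array index of that tile within its level
    nearest_corner: str    # item 2: screen corner where the plane is nearest
    edges: List[Coeffs]    # item 4: one coefficient triple per polygon edge
    plane: Coeffs          # item 5: coefficients of the plane equation

    @property
    def n(self) -> int:
        """Item 3: number of edges, derived here from the edge list."""
        return len(self.edges)


# Example with made-up coefficients for a triangle.
record = TilingRecord(
    level=1,
    index=5,
    nearest_corner="lower left",
    edges=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, -1.0, 8.0)],
    plane=(0.1, 0.2, 0.3),
)
```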

The level number and index specify the tile in the 
z-pyramid ("index" is an array index). The numerical values 
of the coefficients of the edge and plane equations depend on 
the coordinate frame in which they are computed, and FIG. 
10 shows the "standard coordinate frame" that is used for an 
arbitrary 4x4 tile 1000. 

The origin of the coordinate frame is located at the tile's 
lower-left corner 1002, and the x and y axes 1004 are scaled 
so that the centers 1006 of cells 1008 correspond to odd 
integer coordinates and cell borders correspond to even 
integer coordinates. Thus, if an NxN tile is at the finest level 
of the pyramid and image samples are arranged on a uniform 
grid, the coordinates of image samples are the odd integers 
1, 3, 5, ... , 2N-1. If an NxN tile is not at the finest level, 
its cells are squares whose borders lie on the even integers 
0, 2, 4, ... , 2N. The fact that cell coordinates are small 
integer values simplifies evaluation of line and plane equa- 
tions. 
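
A small sketch of the standard coordinate frame may help. It lists the odd coordinates of cell centers and the even coordinates of cell borders for an NxN tile, and evaluates a plane equation of the form z=Ax+By+C at one of them. The helper names and the example coefficients are assumptions introduced here.

```python
def center_coordinates(n):
    """Odd integers 1, 3, ..., 2N-1: cell centers (image samples at the finest level)."""
    return [2 * i + 1 for i in range(n)]


def border_coordinates(n):
    """Even integers 0, 2, ..., 2N: cell borders within the tile."""
    return [2 * i for i in range(n + 1)]


def plane_depth(plane, x, y):
    """Evaluate z = A*x + B*y + C in the tile's coordinate frame."""
    a, b, c = plane
    return a * x + b * y + c


centers = center_coordinates(4)
borders = border_coordinates(4)
depth = plane_depth((0.5, 0.25, 1.0), 3, 5)  # made-up plane, sample at (3, 5)
```

Because x and y are always small integers in this frame, each evaluation reduces to a few multiply-adds on small operands, which is the simplification the text refers to.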

Each tile in the z-pyramid has an associated coordinate 
frame positioned and scaled relative to that tile as illustrated 
in FIG. 10. For example, FIG. 2 shows the coordinate frames 
(e.g. 222, 224) of the eight 4x4 tiles that would be traversed 
during hierarchical tiling of triangle 214. 
The Algorithm for Hierarchical z-buffering. 

Within Tile Polygon List 800, the procedure that hierar- 
chically z-buffers a convex polygon is Tile Convex Polygon 
1100 (FIG. 11). The input to this procedure 1100 is the 
processing mode, either render or v-query, a tiling record, 
and if in render mode, a rendering record. 

When in render mode, the procedure 1100 tiles the
polygon into the z-pyramid 170, updates z-values when
visible samples are encountered, and if the polygon is
visible, outputs its rendering record to the z-buffer renderer
140.

When in v-query mode, the polygon is a face of a
bounding box and the procedure 1100 determines whether
that face contains at least one visible image sample. When
in v-query mode, the z-pyramid 170 is never written, and
processing stops if and when a visible sample is found.

Now, data structures maintained by Tile Convex Polygon 
1100 are described. The procedure 1100 maintains a stack of 
temporary tile records called the "Tile Stack," which is a 
standard "last in, first out" stack, meaning that the last record 
pushed onto the stack is the first record popped off. 

The temporary records in the Tile Stack contain the same 
information as the tiling records previously described, 
except that it is not necessary to include the polygon's 
"nearest corner," since this is the same for all tiles. 
For each level in the pyramid, Tile Convex Polygon 1100
maintains information about the z-pyramid tile within that
level that was accessed most recently. Some of this
information is relative to the tile currently being processed, the
"current tile." The level record 5100 for level J of the
pyramid contains:
Level_Record[J].

1. index of corresponding z-pyramid tile, call this tile "T"
("index[J]");

2. NxN array of z-values for tile T ("z-array[J]");

3. farthest z-value in z-array[J], excluding cell containing
"current tile" ("zfar_x[J]");

4. TRUE/FALSE flag: Is z-array[J] different than
z-pyramid record? ("dirty_flag[J]"); and

5. TRUE/FALSE flag: Is tile T an ancestor of current tile?
("ancestor_flag[J]").

As listed above, the level_record[J] contains the index
for the corresponding tile "T" in the z-pyramid ("index[J]"),
the NxN array of z-values corresponding to tile T
("z-array[J]"), the farthest z-value in z-array[J], excluding the depth
of the cell containing the current tile ("zfar_x[J]," where
subscript "x" alludes to this exclusion rule), a flag indicating
whether the values in z-array[J] differ from the
corresponding values in the z-pyramid ("dirty_flag[J]"), and a flag
indicating whether tile T is an "ancestor" of the current tile
("ancestor_flag[J]" is TRUE if the current tile lies inside tile
T).

For example, assume that indexes 0, 1, ..., F refer to the
coarsest, next-to-coarsest, ..., finest levels of the pyramid,
respectively. In FIG. 2 of the drawings, while processing tile
220, level_record[0] would correspond to the root tile 216,
level_record[1] would correspond to tile 210 (since this
would be the most recently accessed tile at level 1), and
level_record[2] would correspond to tile 220.

As for ancestor flags, ancestor_flag[0] would be TRUE,
since tile 216 is the "grandparent" of tile 220 (in fact,
ancestor_flag[0] is always TRUE), ancestor_flag[1] is
TRUE since tile 210 is the "parent" of tile 220, and
ancestor_flag[2] is FALSE, because a tile is not considered
to be an ancestor of itself.

According to the algorithm, which will be described later,
while processing tile 220, zfar_x values are computed for each
pyramid level in order to facilitate propagation of z-values
when visible samples are found. After processing tile 220,
zfar_x[0] would be the farthest z-value in tile 216 excluding
cell 208 (the cell that contains tile 220), zfar_x[1] would be
the farthest z-value in tile 210 excluding cell 218 (the cell
that contains tile 220), and zfar_x[2] would be the farthest of
all the z-values in tile 220. Given these zfar_x values, at each
level of the pyramid, propagation of z-values only requires
comparing one or two z-values, as will be described later.
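
The exclusion rule behind zfar_x can be illustrated with a short sketch. It computes the farthest z-value in a tile's z-array while skipping the cell that contains the current tile. The function name, the nested-list representation, and the convention that larger z is farther are assumptions for this example; the actual procedure Update zfar_x 1600 is described elsewhere in the specification.

```python
def zfar_excluding(z_array, excluded=None):
    """Farthest (largest) z-value in an NxN z-array, skipping one cell.

    z_array: list of rows of z-values for one tile.
    excluded: (row, column) of the cell containing the current tile,
              or None to include every cell (as for the current tile itself).
    """
    values = [
        z
        for r, row in enumerate(z_array)
        for c, z in enumerate(row)
        if (r, c) != excluded
    ]
    return max(values)


# Made-up 2x2 z-array for a parent tile; cell (0, 1) contains the current tile.
parent = [[0.2, 0.9],
          [0.5, 0.4]]
zfar_x_parent = zfar_excluding(parent, excluded=(0, 1))
zfar_all = zfar_excluding(parent)
```

Keeping the excluded maximum per level is what lets a later update of one cell be propagated upward by comparing only the new value against the stored zfar_x value.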
The Tiling Algorithm. 

Tile Convex Polygon 1100 starts with step 1102. If in
v-query mode, step 1102 initializes the visibility status of the
polygon to occluded.

Next, step 1104 initializes the Tile Stack to the tiling
record that was input. Ancestor_flags need to be computed
when the Tile Stack is initialized at step 1104. While the Tile
Stack is not empty (step 1106), step 1108 gets the record for
the next tile to process (the "current tile") by popping it from
the stack (initially, this is the tiling record that was input,
which corresponds to the smallest enclosing tile).

The level in the pyramid of the current tile is called "L."
Step 1110 checks to see if the z-values for the current tile are
already in z-array[L] (this can be established by comparing
the current tile's index to index[L]). If not, procedure Read
Z-Array 1200 reads the z-values for the current tile from the
z-pyramid 170 and puts them in z-array[L].

Next, Process NxN Tile 1300 processes each of the cells
within the current tile, and if L is not the finest level, for each
cell where the polygon is potentially visible, appends a new
record to the Tile Stack, as will be described later.

At step 1112, if in v-query mode, control proceeds to step 
1114, where if the polygon's status is visible (this is deter- 
mined in Process NxN Tile 1300), the procedure terminates 
at step 1116, and otherwise, control returns to step 1106. 

If in render mode at step 1112, if L is the finest level of
the pyramid and the changed flag is TRUE at step 1118 (this
flag is set in Process NxN Tile 1300), step 1120 writes
z-array[L] to the z-pyramid 170, Propagate Z-Values 1700
"propagates" z-values through the pyramid (if necessary),
and control returns to step 1106.

If L is not the finest level of the pyramid or the changed 
flag is FALSE at step 1118, control returns directly to step 
1106. If the Tile Stack is empty at step 1106, hierarchical 
tiling of the polygon is complete and the procedure termi- 
nates at step 1122. If step 1122 is executed when in v-query 
mode, the polygon is occluded, but since the polygon's 
visibility status was initialized to occluded at step 1102, it is 
not necessary to set the status here. 

When in render mode, prior to returning at step 1122 the
procedure 1100 can output additional information about a
visible polygon to the z-buffer renderer 140. For example, if
a polygon is being rendered with texture mapping and
texture coordinates are computed during tiling, the bounding
box of texture coordinates for the polygon could be output
to inform the z-buffer renderer 140 which regions of a
texture map will need to be accessed.

Summarizing the role of the Tile Stack in Tile Convex 
Polygon 1100 when operating in render mode, the tile stack 
is initialized to a tiling record corresponding to the smallest 
tile in the z-pyramid that encloses the transformed polygon. 

Next, a loop begins with the step of testing whether the 
Tile Stack is empty, and if so, halting processing of the 
polygon. Otherwise, a tiling record is popped from the Tile 
Stack, this tile becoming the "current tile." 

If the current tile is not at the finest level of the pyramid, 
Process NxN Tile 1300 determines the cells within the 
current tile where the polygon is potentially visible, creates 
tiling records corresponding to the potentially visible cells 
and pushes them onto the Tile Stack, and then control returns 
to the beginning of the loop. If the current tile is at the finest 
level of the pyramid, Process NxN Tile 1300 determines any 
visible samples on the polygon, and if visible samples are 
found, the z-pyramid is updated. Then, control returns to the 
beginning of the loop. 

The basic loop is the same when in v-query mode except 
that when a visible sample is encountered, the procedure 
reports that the polygon is visible and then terminates, or if 
an empty Tile Stack is encountered, the procedure reports 
that the polygon is occluded and then terminates. 

Procedure Tile Convex Polygon 1100 performs hierarchical
polygon tiling and hierarchical v-query of polygons by
recursive subdivision. The Tile Stack is the key to
implementing recursive subdivision with a simple, efficient
algorithm that is well suited for implementation in hardware.
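
The Tile Stack loop summarized above can be sketched as follows. This is a simplified illustration, not the described hardware procedure: `process_tile` is a hypothetical stand-in for Process NxN Tile 1300 that returns the child tile records to push and whether a visible sample was found, and z-pyramid reads, writes, and propagation are omitted entirely.

```python
def tile_convex_polygon(mode, start_tile, process_tile):
    """Stack-driven subdivision loop (steps 1104-1122, greatly simplified).

    process_tile(tile) -> (children, found_visible_sample) is an assumed
    callback. Returns True if any visible sample was found.
    """
    tile_stack = [start_tile]            # step 1104: smallest enclosing tile
    visible = False                      # step 1102: status starts occluded
    while tile_stack:                    # step 1106
        current = tile_stack.pop()       # step 1108: last in, first out
        children, found = process_tile(current)
        tile_stack.extend(children)      # potentially visible cells
        if found:
            visible = True
            if mode == "v-query":        # steps 1112-1116: stop early
                return True
    return visible                       # step 1122


# Toy pyramid: tile name -> (child tiles to visit, visible sample found here?)
toy = {"root": (["a", "b"], False), "a": ([], False), "b": ([], True)}
visited = []


def process(tile):
    visited.append(tile)
    return toy[tile]


render_result = tile_convex_polygon("render", "root", process)
render_order = list(visited)
visited.clear()
query_result = tile_convex_polygon("v-query", "root", process)
query_order = list(visited)
```

In this toy run the render pass visits every pushed tile, while the v-query pass stops as soon as the first visible sample turns up.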

The procedure finishes processing one NxN tile before 
beginning another one, and reads and writes z-values in 
NxN blocks. These are not features of prior-art software 
implementations of hierarchical tiling, which use depth-first 
traversal of the pyramid, processing all "children" of one 
cell in a tile before processing other cells in the tile. 

Thus, with prior-art software methods, the "traversal tree" 
describing the order in which z-pyramid tiles are traversed 
is topologically different than with the tiling algorithm
presented herein, which is better suited to implementation in
hardware.

The following describes the three procedures called by
Tile Convex Polygon 1100: Read Z-Array 1200, Process
NxN Tile 1300, and Propagate Z-Values 1700.

Procedure Read Z-Array 1200 (FIG. 12) reads the NxN
array of z-values for a tile specified by its level
number ("L") and index ("I") from the z-pyramid 170 into
z-array[L]. At step 1202, if dirty_flag[L] is TRUE (meaning
that the values in z-array[L] have been modified), step 1204
writes z-array[L] to the z-pyramid 170, writes I to index[L],
sets dirty_flag[L] to FALSE, and sets ancestor_flag[L] to
TRUE.

Next, whether or not step 1204 was executed, step 1206
reads z-values for the specified tile from the z-pyramid 170
into z-array[L], and the procedure terminates at step 1208.
Processing of Tiles.

Process NxN Tile 1300 (FIG. 13) loops over each of the
NxN cells within a tile, processing them in sequence, for
example by looping over the rows and columns of cells
within the tile. The tile's level number in the pyramid is
called "L" and the cell currently being processed will be
called the "current cell."

If L is the finest level and in render mode, step 1302 sets
a flag called changed to FALSE, sets a variable called
zfar_finest to the depth of the near clipping plane, and sets
all values in array zfar_x to the depth of the near clipping
plane. While cells remain to be processed (step 1304), if L
is the finest level and in render mode, step 1306 updates
array zfar_x using procedure Update zfar_x 1600.
Occlusion Test.

Next, step 1308 determines whether the plane of the
polygon is occluded within the current cell. The polygon's
plane equation, which is stored in the tiling record, has the
form:

z=Ax+By+C

If the current cell corresponds to an image sample, the
depth of the polygon is computed at this sample by
substituting the sample's x and y coordinates into the polygon's
plane equation.

If the polygon's depth at this point is greater than the
corresponding z-value stored in z-array[L] (which is
maintained in Tile Convex Polygon 1100), this sample on the
polygon is occluded, and control proceeds to step 1312. At
step 1312, if at the finest level of the pyramid and in render
mode, if the z-value in z-array[L] which corresponds to the
current cell is farther than variable zfar_finest, variable
zfar_finest is overwritten with that z-value. Following step
1312, control returns to step 1304.

At step 1308, if the current cell corresponds to a square
region of the screen (rather than an image sample), the
nearest point on the plane of the polygon within that square
is determined. This is done by evaluating the plane equation
at the corner of the cell where the plane is nearest to the
viewer.

This "nearest corner" can be determined easily from the
plane's normal vector using the following method, which is
illustrated in FIG. 14.

Suppose that triangle 1400 is being processed within cell
1402 of tile 1404, and vector 1406 is a backward-pointing
normal vector (nx,ny,nz). Then the corner of the cell 1402
corresponding to the "quadrant" of vector (nx,ny) indicates
the corner where the plane of the polygon is nearest to the
viewer.

In this instance, the "nearest corner" is 1408, since nx and
ny are both negative. (In general, the +x,+y quadrant is upper
right, the +x,-y quadrant is lower right, the -x,-y quadrant
is lower left, and the -x,+y quadrant is upper left.)

To help in visualizing this, the normal vector 1406
attaches to the center of the back of the triangle 1400, points
into the page, and the dashed portion is occluded by the
triangle 1400. Step 906 of Transform & Set Up Polygon 900
uses this method to compute the polygon's nearest corner,
which is the same at all tiles.

If the normal vector is forward-pointing
instead of backward-pointing, a cell's nearest corner
corresponds to the quadrant of vector (-nx,-ny) instead of vector
(nx,ny).

The next step is to compute the depth of the plane of the
polygon at the nearest corner of the current cell, called the
plane's znear value within the cell, by substituting the
corner's x and y coordinates into the polygon's plane
equation, which has the form z=Ax+By+C, where x and y
are even integers. Actually, this equation is evaluated
hierarchically, as will be explained later.

Next, the plane's znear value is compared to the zfar value
stored in z-array[L] that corresponds to the current cell, and
if the znear value is farther than the zfar value, the plane of
the polygon is occluded within the current cell and control
proceeds to step 1312. Otherwise, control proceeds to step
1310.

The depth comparison described above is the only
occlusion test performed on a polygon with respect to a given cell.
This single occlusion test is not definitive when the nearest
corner of the cell lies outside the polygon.

In this case, rather than perform further computations to
establish visibility definitively, the occlusion testing of the
polygon with respect to the cell is halted and visibility is
resolved by subdivision. This culling method is preferred
because of its speed and simplicity.

The steps of the above method for testing a polygon for
occlusion within a cell covering a square region of the screen
are summarized in the flowchart of FIG. 28, which describes
the steps performed at step 1308 when the current cell
corresponds to a square region of the screen (rather than an
image sample).

First, step 2802 determines the corner of the cell where
the plane of the polygon is nearest using the quadrant of
vector (nx,ny), where (nx,ny,nz) is a backward-pointing
normal to the polygon (or if the normal is forward-pointing,
the quadrant of vector (-nx,-ny) is used instead).

Next, step 2804 computes the depth of the plane at that
"nearest corner," i.e., the plane's znear value. At step 2806,
if the plane's znear value is farther than the z-value for the
cell stored in the z-pyramid, step 2808 reports that the plane
(and hence the polygon) is occluded and the procedure
terminates at step 2812.

Otherwise, step 2810 reports that the plane (and hence the
polygon) is potentially visible and no further occlusion
testing is performed for the polygon with respect to the cell.
Following step 2810, the procedure terminates at step 2812.

Examples of occlusion tests performed by procedure Is
Plane Occluded within Cell 2800 are illustrated in FIG. 26,
which shows a side view of a cell in a z-pyramid, which in
three dimensions is a rectangular solid 2600 having a square
cross-section. Given the indicated direction of view 2602,
the right-hand end 2604 of the solid 2600 is the near clipping
plane and the left-hand end 2606 of the solid 2600 is the far
clipping plane.

The bold vertical line indicates the current z-value 2608
stored in the z-pyramid cell. The three inclined lines, 2610,
2620, and 2630, indicate the positions of three polygons,
each covering the cell and each oriented perpendicular to the
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page to simplify illustration. For each polygon, the znear and Step 1310 determines whether the current cell lies outside 

zfar values of its plane within the cell are shown by dashed any edge of a polygon by using this method to compare the 

lines. cell to each edge. 

Procedure Is Plane Occluded within Cell 2800 would This method is not a definitive cell-polygon intersection 

show that polygon 2610 is occluded at the illustrated cell $ tcst > but it is simple and conservative, never culling a cell 

because the znear value 2612 of the polygon's plane is containing a visible image sample. If the current cell is 

farther than the cell's z-pyramid value 2608. Procedure Is outside any edge, control proceeds to step 1312. Otherwise, 

Plane Occluded within Cell 2800 would show that polygon control proceeds to step 1314. 

2620 is potentially visible at the illustrated cell because the Step 1308 and each of the "outside-edge" tests of step 

znear value 2622 of the polygon's plane is nearer than the 1310 can all be done in parallel. 

cell's z-pyramid value 2608. At step 1314, if L is the finest level, a visible image 

It is preferable that z-values within the z-pyramid 170 are sample has been found, and control proceeds to step 1316. 

stored at low-precision (e.g., in 8 bits), and this complicates If in v-query mode at step 1316, step 1318 sets the 

depth comparisons slightly. A low-precision z-value can be polygon's visibility status to visible and the procedure 

thought of as representing a small range of z-values in the terminates at step 1320. If not in v-query mode, if the 

interval [near far]. 15 polygon's rendering record has not yet been output to the 

If the plane's znear value computed at step 1308 is farther z-buffer Tenderer 140, this is done at step 1322. 

than far, the plane is occluded within the cell, and if znear In the preferred embodiment of the invention, the reso- 

is nearer than near, the plane is visible within the cell. But lution of the finest level of the z-pyramid 170 is the same as 

if znear is between near and far it cannot be determined mc resolution of the image raster. However, it is also 

whether the plane is visible within the cell. 20 possible to use a low-resolution z-pyramid. This option and 

In this last case, it is assumed that the polygon is visible associated steps 1334 and 1336 will be described in more 

so that culling will be conservative, never culling a polygon detail later. 

containing a visible image sample. This same analysis is Assuming a full-resolution z-pyramid 170, following step 
applied in the other conservative culling procedures dis- 1322 > ste P 1326 xts tne changed flag to TRUE. Next, step 
cussed herein when depth comparisons involving low- 25 1328 updates zfar__finest, a variable that keeps track of the 
precision z-values are performed. farthest z-value encountered thus far within the current tile. 
Overlap Tests. Accordingly, if the z-value computed for the current cell at 
At step 1310 of procedure 1300, the objective is to ste P 1308 is farther than zfarjinest, zfar_finest is over- 
determine whether the current cell and the polygon overlap written with that z-value. 

on the screen. 30 Next, step 1330 writes the z-value computed for the 

There can be no overlap where the current cell lies polygon at step 1308 to the appropriate entry in z-array[F] 

entirely outside an edge of the polygon. For each of the (where F is the index of the finest pyramid level), 

polygon's edges, it is determined whether the current cell 11 is possible to update the z-pyramid 170 directly at this 

lies outside that edge by substituting the appropriate point stc P> but to improve efficiency, preferably, the z-pyramid is 

into its edge equation, which has the form: 35 read and written in records for NxN tiles. 

Ax+By+c-o According to the preferred embodiment of the present . 

invention (FIG. 1), shading is not performed in the stage of 

If the current cell corresponds to an image sample, the the graphics system that is presently being described, but it 

"appropriate poinf ' is that image sample. is possible to do so. For example, it is possible to combine 

In FIG. 10, assume that tile 1000 is at the finest level of 40 the culling stage 130 and its z-pyramid 170 with the z-buffer 

the pyramid and the half-plane 1012 lying outside edge 1010 Tenderer 140 and its z-buffer 180 into a single stage: a 

is defined by the inequality Ax+By+C<0. Coefficients A, B, hierarchical z-buffer Tenderer with a z-pyramid. 

and C in this inequality (which were computed at step 908 With this architecture, step 1332 would compute the color 

of procedure 900) are computed relative to the tile's coor- of the image sample and then overwrite the output image, 

dinate frame 1004, and image samples within the tile have 45 Also, step 1322 would be omitted (as would step 1340), 

odd integer coordinates. since there would no longer be a separate rendering stage. 

To determine whether an image sample lies outside an Step 1332 is shown in a dashed box to indicate that it is an 

edge, its x and y coordinates are substituted into the edge's option and not the preferred method, 

equation and the sign of the result is checked. Step 1310 Whether or not pixels are shaded in this procedure 1300, 

performs this test on each edge of the polygon (or until it is 50 control returns to step 1304. 

determined that the sample lies outside at least one edge). If At step 1314, if L is not the finest level, control proceeds 

the sample is outside any edge, control proceeds to step to step 1338, which is an optional step (as indicated by its 

1312. Otherwise, control proceeds to step 1314. depiction in dashed lines). If in render mode, step 1338 

At step 1310, if the current cell corresponds to a square computes the maximum amount that continued tiling within 

region of the screen (rather than an image sample), it must 55 the current cell can advance z-values in the pyramid, which 

be determined whether that square lies entirely outside an is the difference between the znear value of the polygon's 

edge of the polygon. For each edge, this can be done by plane computed at step 1308 and the z-value stored for the 

substituting the coordinates of a single comer point of the current cell in z-array[L]. 

current cell into the edge equation, using the comer that is If the maximum "z advance" is less than some specified 

farthest in the "inside direction" with respect to the edge. 60 positive threshold value, call it zdelta, the current cell is not 

In FIG. 10, the inside direction for edge 1010 is indicated subdivided and the polygon is assumed to be visible. In this 

by arrow 1018, the corner of cell 1022 that is farthest in the case, control proceeds to step 1340, which outputs the 

inside direction is corner 1020, and substituting the corner's polygon's rendering record to the z-buffer Tenderer 140, if 

x and y coordinates into the equation for edge 1010 shows this has not already been done, after which control returns to 

that comer 1020 and cell 1022 lie outside of edge 1010. The 65 step 1304. 

corner points of cells have even integer coordinates, (2,2) in In FIG. 26 the bold dashed line 2640 shows the z-pyramid 

the case of point 1020. value 2608 for a cell offset in the near direction by zdelta. 
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Since the znear value 2622 for polygon 2620 is farther than 
this offset z-pyramid value 2640, tiling of polygon 2620 
would stop within the illustrated cell, since the maximum 
amount that continued tiling could advance the z-pyramid 
value for the cell is less than zdelta. On the other hand, tiling 
of polygon 2630 would continue, since its znear value 2632 
is nearer than the offset z-pyramid value 2640. 
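
By way of illustration only, the nearest-corner depth test and the optional zdelta test described above may be sketched as follows. The sketch assumes that larger z-values are farther from the viewer, that the plane is given as z=Ax+By+C in the tile's coordinate frame, and that the nearest corner may be chosen per axis from the signs of A and B (standing in for the normal-quadrant rule of FIG. 14). The function names and the returned labels are hypothetical and do not appear in the procedures described herein.

```python
def plane_znear_in_cell(A, B, C, x0, y0, size):
    """Depth of plane z = A*x + B*y + C at the cell corner where z is smallest.

    Assumption: larger z is farther, so the smallest z is the nearest point.
    (x0, y0) is the cell's lower corner and size is its edge length.
    """
    x = x0 if A >= 0 else x0 + size   # z grows with x when A >= 0
    y = y0 if B >= 0 else y0 + size   # z grows with y when B >= 0
    return A * x + B * y + C


def classify_cell(znear, cell_z, zdelta):
    """Hypothetical three-way outcome for a non-finest cell in render mode."""
    if znear > cell_z:
        return "occluded"             # plane is behind the stored z-value
    if cell_z - znear < zdelta:
        return "visible_stop_tiling"  # maximum z advance is below threshold
    return "subdivide"                # continue tiling within the cell


znear = plane_znear_in_cell(1, -1, 10, 0, 0, 2)   # nearest corner is (0, 2)
```

With a small zdelta the second outcome is rare, and a larger zdelta trades culling efficiency for less tiling work.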

Although step 1338 can decrease the culling efficiency of 
the z-pyramid, it also reduces the amount of tiling the culling 
stage 130 needs to do, and in some cases, this is a good 
trade-off, improving the overall performance of the system.

If step 1338 is not employed or if its conditions are not 
satisfied, control proceeds to step 1342. Steps 1342 and 1344 
create the tiling record for a new NxN tile corresponding to 
the current cell, this record including new coefficients for the 
polygon's edge and plane equations. 

Step 1342 "transforms" the current tile's edge and plane 
equations so that their coefficients are relative to the coor- 
dinate frame of the new tile, using a method that will be 
described later. If tiling records also include the coefficients 
of shading equations, these equations are also transformed. 

Step 1344 computes the level number and index of the 
new tile, creates a tiling record for the tile, and pushes this 
record onto the Tile Stack. Following step 1344, control 
returns to step 1304. 

When all cells within the tile have been processed at step 
1304, the procedure terminates at step 1346. 

Although procedure Process NxN Tile 1300 processes 
cells one by one, it is also possible to process cells in 
parallel, for example, by processing one row of cells at a 
time. 

Hierarchical Evaluation of Line and Plane Equations. 

Before describing the hierarchical evaluation method 
employed by the invention, the underlying problem will be 
described. When z-buffering a polygon, it is necessary to 
evaluate the linear equations defining the polygon's edges 
and plane. 

Edge equations have the form Ax+By+C=0 and plane
equations are expressed in the form z=Ax+By+C. When 
performing hierarchical z-buffering, these equations must be 
evaluated at points on tiles in the image hierarchy. 

Each of these equations includes two additions and two 
multiplications so direct evaluation is relatively slow, and if 
evaluation is performed with dedicated hardware, the cir- 
cuitry required to perform the multiplications is relatively 
complex. 

Efficient evaluation of these equations is the cornerstone 
of various prior-art algorithms for z-buffering polygons. 
However, prior-art methods are not particularly efficient 
when a polygon covers only a small number of samples, as 
is the case when tiling is performed on tiles of an image 
hierarchy, and they do not take advantage of coherence that 
is available in an image hierarchy. 

Thus, there is a need for a more efficient method for 
evaluating the linear equations defining a polygon within 
tiles of an image hierarchy. 

The novel method employed by the invention achieves 
efficiency by evaluating line and plane equations 
hierarchically, as will be described now. 

Within Process NxN Tile 1300, at every cell it is necessary
to evaluate a plane equation of the form z=Ax+By+C at
step 1308 and edge equations of the form Ax+By+C=0 at
step 1310. Coefficients A, B, and C are computed relative to
the standard coordinate frame of FIG. 10, and the advantage
of this approach is that the values of x and y in the equations
are small integers, which permits the equations to be
evaluated with shifts and adds, rather than performing
general-purpose multiplication.

For example, while looping over cells within a tile,
equation z=Ax+By+C can be computed incrementally as
follows: at (0,0) z=C, at (2,0) z=C+2A, at (4,0) z=C+4A, and
so forth. Even when incremental methods are not used, the
equations can be evaluated efficiently with shifts and adds.

For example, if x is 5, the term Ax can be computed by 
adding A to 4A, where 4A is obtained by shifting. 
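
As a minimal sketch of the two techniques just described, the following code steps z along a row of cells using additions only and forms 5A from a shift and an add. Integer coefficients are assumed, and the function names are hypothetical.

```python
def row_depths_incremental(A, z_at_x0, n_cells):
    """z at x = 0, 2, 4, ... along one row, computed with additions only."""
    step = A + A            # 2A, the change in z between adjacent cell corners
    z = z_at_x0             # value of By + C for this row, i.e. z at x = 0
    depths = []
    for _ in range(n_cells):
        depths.append(z)
        z += step
    return depths


def times_five(a):
    """5*a for an integer a, formed as 4a (a shift) plus a."""
    return (a << 2) + a


depths = row_depths_incremental(3, 10, 4)   # A = 3 and z = 10 at x = 0
```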

At step 1342 of Process NxN Tile 1300, new coefficients 
of edge and plane equations are computed when cells are 
subdivided. The objective is to transform a linear equation of 
x and y from the coordinate frame of an NxN tile to the 
coordinate frame of cell (xt,yt) within it. 

More particularly, in FIG. 2 consider cell 218 within 4x4 
tile 210, which corresponds to 4x4 tile 220 at the adjacent 
finer level of the pyramid. The relationship in screen space 
between the (x,y) coordinate frame 222 of tile 210 and the
(x',y') coordinate frame 224 of tile 220 is shown in FIG. 15.

Relative to coordinate frame 222, coordinate frame 224 is 
translated by vector (xt,yt), in this case (6,4), and scaled by 
a factor of four (and in general for an NxN tile, a factor of 
N). 

When the tiling record for triangle 214 is created by 
procedure Transform & Set Up Polygon 900, step 908 
computes coefficients (A,B,C) in the edge equation Ax+By+ 
C=0 for edge 1502 relative to coordinate frame (x,y) of tile 
210 (this is the smallest enclosing tile). When tile 210 is 
subdivided and a record for tile 220 is created, this edge 
equation is transformed to edge equation A'x'+B'y'+C'=0,
which is relative to coordinate frame (x',y') of tile 220. 

New coefficients (A',B',C') are computed using the following
transformation formulas 4000, which are applied to
edge and plane equations at step 1342 of procedure 1300:

A'=A/N
B'=B/N
C'=Axt+Byt+C.

Assuming that N is a power of two, A' and B' can be
obtained by shifting. Frequently, Ax+By+C has already been
evaluated at (xt,yt) at step 1308 or 1310 of procedure 1300,
in which case C' is already known. Whether or not this is
exploited, C' can be efficiently computed since xt and yt are
small integers.

Thus, computing new coefficients for the line and plane
equations is done very efficiently at step 1342 of procedure 
1300, without performing general-purpose multiplication. 
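
The effect of transformation formulas 4000 may be checked with a short sketch such as the following. It assumes that a point (x',y') in the child frame corresponds to the point (xt+x'/N, yt+y'/N) in the parent frame, consistent with the translation and scaling described above. Ordinary division stands in for the shift that hardware would use, and the function name is hypothetical.

```python
def to_child_frame(A, B, C, xt, yt, n):
    """Formulas 4000: coefficients of w = A*x + B*y + C in the child frame."""
    return A / n, B / n, A * xt + B * yt + C


# Parent equation w = 8x - 4y + 3, child tile at cell (6, 4) of a 4x4 tile.
A2, B2, C2 = to_child_frame(8, -4, 3, 6, 4, 4)

# Child point (3, 5) is parent point (6 + 3/4, 4 + 5/4).
w_child = A2 * 3 + B2 * 5 + C2
w_parent = 8 * (6 + 3 / 4) - 4 * (4 + 5 / 4) + 3
```

Both evaluations give the same value, which is the property that lets tiling continue in the child frame with small integer coordinates.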

The same transformation formulas 4000 can be applied to
any linear equation of the form w=Ax+By+C including edge
equations, plane equations, and equations used in shading.
If shading is performed during hierarchical tiling at step
1332 of procedure 1300, the method can be applied to
interpolating vertex colors of triangles (i.e., performing
Gouraud shading). In this case, the intensities of the red,
green, and blue color components can each be expressed as
a linear equation (e.g., red=Ax+By+C) and evaluated in the
same way as z-values.

Since both sides of an equation can be multiplied by the
same quantity, equation w=Ax+By+C is equivalent to equation
Nw=N(Ax+By+C). Hence, using the following transformation
formulas 4001 would result in computing Nw
rather than w:

A'=A
B'=B
C'=N(Axt+Byt+C).

In this case, coefficients A and B are unchanged but it is
necessary to compute w from Nw by shifting (unless only
the sign of the equation must be determined, as is the case
when evaluating an edge equation).
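
A corresponding sketch for formulas 4001 is given below, under the assumptions that the coefficients are integers and that N is a power of two expressed as a shift count. The function name and the parameter `shift` are hypothetical. Because only C' is scaled, the arithmetic stays in integers, and the sign of the result equals the sign of w.

```python
def to_child_frame_scaled(A, B, C, xt, yt, shift):
    """Formulas 4001 with N = 2**shift: the child equation evaluates N*w."""
    return A, B, (A * xt + B * yt + C) << shift


# Parent equation w = 3x - 5y + 7, child tile at cell (6, 4), N = 4.
A2, B2, C2 = to_child_frame_scaled(3, -5, 7, 6, 4, 2)

# Child point (3, 5) is parent point (6.75, 5.25), where w = 1.0.
nw = A2 * 3 + B2 * 5 + C2      # N*w, still an integer
w = nw / 4                     # recover w only if its magnitude is needed
```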

Regardless of whether formulas 4000 or formulas 4001 
are employed, transforming a linear equation from the 
coordinate frame of one tile to the coordinate frame of a 
"child" tile involves translation and scaling computations, 
where scaling is performed by shifting. With formulas 4000, 
scaling is performed by shifting coefficients A and B of the 
equation, and with formulas 4001, scaling is performed by 
shifting Axt+Byt+C, which is a linear expression of the 
coefficients of the equation. 

This method for hierarchical evaluation of linear equa- 
tions can also be applied in higher dimensions. For example, 
3D tiling of a convex polyhedron into a voxel hierarchy 
having NxNxN decimation could be accelerated by hierar- 
chical evaluation of the plane equations of the polyhedron's 
faces, which each have the form Ax+By+Cz+D=0. For cell 
(xt,yt,zt) within an NxNxN tile, the transformed coefficients 
of this equation are: 

A'=A/N
B'=B/N
C'=C/N
D'=Axt+Byt+Czt+D,

or equivalently,

A'=A
B'=B
C'=C
D'=N(Axt+Byt+Czt+D).



This method of hierarchical evaluation can be applied to
evaluate higher-degree polynomial equations. For example,
the general equation for a conic section (ellipse, parabola, or
hyperbola) is Ax^2+Bxy+Cy^2+Dx+Ey+F=0. For cell (xt,yt)
within an NxN tile, the transformed coefficients of this
equation are:

A'=A/N^2
B'=B/N^2
C'=C/N^2
D'=(2Axt+Byt+D)/N
E'=(2Cyt+Bxt+E)/N
F'=Axt^2+Bxtyt+Cyt^2+Dxt+Eyt+F,

or equivalently,

A'=A
B'=B
C'=C
D'=N(2Axt+Byt+D)
E'=N(2Cyt+Bxt+E)
F'=N^2(Axt^2+Bxtyt+Cyt^2+Dxt+Eyt+F).

Evaluation of these equations can be accelerated by
computing some or all of the terms with shifting and
addition, rather than multiplication. As with transforming
linear equations, the above transformation formulas perform
translation and scaling computations, and scaling is
accomplished by shifting (shifting either a single coefficient or a
polynomial expression of coefficients, such as expression
2Axt+Byt+D in the formula above).
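
The conic transformation may be verified in the same manner. The following sketch uses exact rational arithmetic so that the comparison is not affected by rounding, and it again assumes that child point (x',y') corresponds to parent point (xt+x'/N, yt+y'/N). The function names are hypothetical.

```python
from fractions import Fraction


def conic_to_child(A, B, C, D, E, F, xt, yt, n):
    """First (unscaled) form of the transformed conic coefficients."""
    n = Fraction(n)
    return (A / n**2, B / n**2, C / n**2,
            (2 * A * xt + B * yt + D) / n,
            (2 * C * yt + B * xt + E) / n,
            A * xt * xt + B * xt * yt + C * yt * yt + D * xt + E * yt + F)


def conic_eval(coeffs, x, y):
    """Value of A*x^2 + B*x*y + C*y^2 + D*x + E*y + F."""
    A, B, C, D, E, F = coeffs
    return A * x * x + B * x * y + C * y * y + D * x + E * y + F


parent = (1, 2, 3, 4, 5, 6)
child = conic_to_child(*parent, xt=2, yt=4, n=4)

# Child point (3, 5) is parent point (2 + 3/4, 4 + 5/4).
v_child = conic_eval(child, 3, 5)
v_parent = conic_eval(parent, Fraction(11, 4), Fraction(21, 4))
```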

The hierarchical evaluation methods described above can 
be applied when the image raster has jittered samples by 
scaling up the coordinate frame of the tiles. For example, if 
the coordinate frame of a 4x4 tile is scaled up by a factor of 
4, there would be 32 integer values across the tile instead of 
8, and the x and y coordinates of jittered image samples 
could have any of these values. 

In summary, the hierarchical evaluation methods 
described above can be applied to accelerating processing of 
geometric objects described by polynomial equations within 
a spatial hierarchy (e.g., an image pyramid, octree, quadtree, 
etc.) that is organized in nested tiles that progress in scale by 
powers of two. 

The method transforms a polynomial equation (e.g., a 
linear or quadratic equation of x and y) from the coordinate 
frame of one tile to the coordinate frame of a "child" tile at 
the adjacent finer level of the hierarchy. This transformation 
is performed by translation and scaling computations, where 
scaling is performed by shifting the binary representation of 
the equation's coefficients or by shifting the binary repre- 
sentation of a polynomial expression of the equation's 
coefficients. 

Shifting can be used to scale numbers represented in 
floating-point format, in addition to numbers represented in 
integer format. The advantage of this method of hierarchical 
evaluation is that evaluation can often be done without 
performing general-purpose multiplication, thereby acceler- 
ating computation and simplifying the required circuitry. 
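
As an illustration of scaling a floating-point number by a power of two without multiplication, the sketch below adjusts only the exponent. Python's `math.ldexp` is used here as a stand-in for the exponent adjustment that dedicated circuitry would perform, and the function name is hypothetical.

```python
import math


def scale_by_power_of_two(value, exponent):
    """Return value * 2**exponent by adjusting the floating-point exponent."""
    return math.ldexp(value, exponent)


a_child = scale_by_power_of_two(3.5, -2)   # A' = A/N for N = 4
c_scaled = scale_by_power_of_two(3.5, 3)   # scaling up by 8
```

Barring overflow or underflow, the mantissa is unchanged, so the result is exact.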

Hierarchical evaluation of equations can be applied to a 
variety of tiling, shading, and interpolation computations 
which require evaluation of polynomial equations at samples 
within a spatial hierarchy. The method is well suited to 
implementation in hardware and it works well in combina- 
tion with incremental methods. 
Propagation of Z-Values.

While looping over cells within a finest-level tile, Process
NxN Tile 1300 determines zfar_x[L] at each pyramid level L
and the tile's zfar value (zfar_finest). Given this
information, propagation can usually be performed with
only one or two depth comparisons at each level of the
pyramid (actually, this is only possible at levels where the
ancestor_flag is TRUE, but this usually is the case).

The prior-art method of performing propagation during
hierarchical z-buffering requires performing N^2 depth
comparisons for NxN tiles at each level of propagation. The
method described herein accelerates propagation by
reordering most of these depth comparisons, performing them
during tiling.

Another advantage of maintaining zfar_x values is that
when propagation to an ancestor tile is not necessary, this 
can be determined without accessing z-values for the ances- 
tor tile. 
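
The saving in depth comparisons may be sketched as follows. The sketch assumes that larger z-values are farther, and that the per-level values maintained during tiling are supplied as a list ordered from the parent level toward the coarsest level, each entry holding the farthest z-value in that ancestor tile outside the cell covering the current tile. The function name and the list layout are hypothetical.

```python
def ancestor_zfars(zfar_finest, zfar_x_by_level):
    """Farthest z within the current tile and each of its ancestors.

    One comparison per level replaces the N*N comparisons that a full
    rescan of each ancestor tile would need.
    """
    farthest = zfar_finest
    result = [farthest]
    for level_far in zfar_x_by_level:   # parent, grandparent, ...
        farthest = max(farthest, level_far)
        result.append(farthest)
    return result


chain = ancestor_zfars(0.4, [0.6, 0.5, 0.9])
```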

Suppose that ZFAR[F] is the farthest z-value within the
current tile C in the finest level (where F is the index of the
finest level), ZFAR[F-1] is the farthest z-value within the
parent tile of the current tile, and so forth. Then the farthest
z-values within ancestor tiles can be computed from
zfar_finest and the values in array zfar_x as follows:

ZFAR[F]=zfar_finest (zfar within C),

ZFAR[F-1]=farthest of (ZFAR[F], zfar_x[F-1]) (zfar
within parent of C),

ZFAR[F-2]=farthest of (ZFAR[F-1], zfar_x[F-2]) (zfar
within grandparent of C),

and so forth.

Propagation can stop whenever it fails to change the
existing value in an ancestor tile. The actual algorithm used
to perform propagation will be presented after discussing
procedure Update zfar_x 1600 (FIG. 16), which maintains
array zfar_x.

Procedure Update zfar_x 1600 is called at step 1306 of
Process NxN Tile 1300 to update zfar_x values. The
procedure receives as input the index "I" of the current cell within
the current tile.

Step 1602 initializes variable "L" to the finest level of the
pyramid.

Next, at step 1604, if the z-pyramid cell with index I in
z-array[L] (i.e., z-array[L][I]) covers the current tile, control
proceeds to step 1610. Otherwise, at step 1606, if
z-array[L][I] is farther than the current value of zfar_x[L],
zfar_x[L] is set equal to z-array[L][I], and then control
proceeds to step 1610.

At step 1610, if L is the coarsest level, the procedure
terminates at step 1612. Otherwise, step 1614 sets L to the
index of the adjacent coarser level and control returns to step
1604.

At any level L where ancestor_flag[L] is FALSE, zfar_x[L]
is not a valid value and it will need to be recomputed later,
but this is a relatively rare event. Although the method just
described computes zfar_x values one by one, all values can
be computed in parallel.

The propagation procedure, Propagate Z-Values 1700
(FIG. 17), is called after step 1120 of Tile Convex Polygon
1100. Step 1702 initializes variable L to the finest level of
the pyramid and variable K to the next-to-finest level.

Next, if variable zfar_finest (zfar of the most recently
processed finest-level tile) is not nearer than zfar_x[L], no
propagation can be performed, so the procedure terminates
at step 1706. Next, step 1708 sets variable zfar to variable
zfar_finest.

Next, if ancestor_flag[K] is FALSE (step 1710), step
1712 reads the z-values corresponding to the level-K
ancestor of the current cell from the z-pyramid into z-array[K]
using procedure Read Z-Array 1200. If ancestor_flag[K] is
TRUE at step 1710, control proceeds directly to step 1714.

Step 1714 determines the index "A" of the cell within
array z-array[K] that is an ancestor of the z-value being
propagated. Next, step 1716 sets variable zold to the depth
value for cell A in z-array[K] (i.e., z-array[K][A]).

Next, step 1718 overwrites z-array[K][A] with the value
of variable zfar. Next, if K is the coarsest level (step 1720),
step 1722 determines whether zfar is farther than zfar_x[K].
If so, zfar is a new zfar value for the entire z-pyramid, and
step 1724 sets variable pyramid_zfar to variable zfar.
Whether or not step 1722 is executed, the procedure
terminates at step 1726.

If K is not the coarsest level at step 1720, control proceeds
to step 1728, where if Read Z-Array 1200 was executed at
step 1712, zfar_x[K] is computed from the values in
z-array[K] (this is a relatively slow procedure, but usually it is not
required). Next, at step 1730, if zold is not farther than
zfar_x[K], the procedure terminates at step 1732.

Otherwise, step 1734 sets variable zfar equal to the
farthest of variables zfar and zfar_x[K]. Next, step 1736 sets
L equal to K and sets K equal to the level that is adjacent to
and coarser than L, and control returns to step 1710.

Although procedure Process NxN Tile 1300 updates array
zfar_x while looping over individual cells in an NxN tile, the
same approach could also be applied if several cells were
computed in parallel, for example, if tiles were processed
row-by-row instead of cell-by-cell.

When a new value of variable pyramid_zfar is
established at step 1724, the far clipping planes maintained by the
scene manager 110 and the z-buffer 180 can be reset to this
nearer value.

Variable pyramid_zfar is part of the tip of the z-pyramid
which is copied to the scene manager 110 at step 716 of
procedure 700. The scene manager 110 uses pyramid_zfar
to reset the far clipping plane, and it uses pyramid_zfar and
other copied depth values to cull occluded bounding boxes,
as described below.

Culling with the Tip of the Z-Pyramid.

When culling boxes with a z-pyramid, occlusion can
sometimes be detected with a single depth comparison.
However, when culling is performed with procedure Process
Batch of Boxes 700, culling an occluded box requires
transforming the box's front faces to perspective space,
processing them with the culling stage 130, and reporting
results to the scene manager 110.

To avoid the latency caused by these steps, an alternative
is for the scene manager 110 to maintain some z-pyramid
values and cull a box if it (or its bounding sphere) is
occluded by a z-pyramid cell. Only if occlusion cannot be
detected at this stage is a box sent through the rest of the
system.

According to the method of the invention, after v-query
results are reported to the scene manager 110 on the
feedback connection 190 at step 714 of Process Batch of Boxes
700, step 716 copies the tip of the z-pyramid 170 to the scene
manager 110. The "tip" includes the zfar value for the entire
z-pyramid (i.e., pyramid_zfar), the coarsest NxN tile in the
pyramid, and perhaps some additional levels of the pyramid
(but not the entire pyramid, since this would involve too
much work).

The amount of data that needs to be copied may be very
modest. For example, if the copied tip includes
pyramid_zfar, the coarsest 4x4 tile, and the 16 4x4 tiles at the adjacent
finer level, a total of 273 z-values need to be copied. In some
cases, the scene manager 110 can cull a substantial amount
of occluded geometry using this relatively small amount of
occlusion information.

At step 702 of procedure Process Batch of Boxes 700, the
scene manager 110 uses the tip of the pyramid to perform
conservative culling on occluded bounding boxes using
procedure Is Box Occluded by Tip 1900 (FIG. 19). This
culling procedure 1900 is illustrated in FIGS. 18a and 18b,
which show the coordinate frame of model space 1800 (the
coordinate frame that the model is represented in), bounding
boxes 1802 and 1804, the view frustum 1806 with its far
clipping plane 1810, the current zfar value of the z-pyramid
(i.e., pyramid_zfar) 1812, and the current zfar values for a
row 1814 of cells within the coarsest NxN tile 1816 of the
z-pyramid 170, including the zfar value of cell 1820.

To simplify illustration, the frustum is oriented so that the
viewing axis 1822 is parallel to the page and four faces of
the frustum are perpendicular to the page.

If pyramid_zfar 1812 is nearer than the depth of the far
clipping plane 1810, this establishes a nearer value for the
far clipping plane, so the far clipping plane is reset to this
value. In FIG. 18a, resetting the far clipping plane to
pyramid_zfar 1812 enables rapid culling of box 1802, since
the depth of the nearest corner of box 1802 (which was
computed at step 614 of procedure Sort Boxes into Layers
600) is farther than pyramid_zfar 1812.

Now the steps of procedure Is Box Occluded by Tip 1900
are described. The procedure is described infra as it applies
to box 1804 in FIGS. 18a and 18b. Step 1902 determines
whether the nearest corner of the box is farther than the far
clipping plane.

If so, step 1912 reports that the box is occluded, and the
procedure terminates at step 1916. If not, control proceeds to
step 1904, which determines a bounding sphere 1824 for the
box 1804, and step 1906 transforms the sphere's center 1826
to perspective space and determines the depth D 1828 of the
sphere's nearest point.

Next, step 1908 determines the smallest z-pyramid cell
1820 that encloses the sphere 1824 and reads the cell's zfar
value. If depth D 1828 is farther than zfar (step 1910), step
1912 reports that the box is occluded (this is the case with
box 1804) and the procedure terminates at step 1916.

Otherwise, step 1914 reports that the box is potentially
visible and the procedure terminates at step 1916.

Summarizing this culling method, the scene manager 110
receives the tip of the z-pyramid 170 along with v-query
results on connection 190 and uses these z-values to reset the
far clipping plane and perform conservative culling of
bounding boxes. The method described supra for culling
boxes with the tip of the z-pyramid is very efficient because
processing a box only requires transforming a single point
(or none) and making a single depth comparison.

The tip of the pyramid is in fact a low-resolution
z-pyramid, that is, a z-pyramid with lower resolution than
the z-pyramid 170 maintained by the culling stage 130, or if
there is no separate culling stage, than the z-pyramid
maintained by a hierarchical rendering stage.

Data Flow within the Culling Stage.

FIG. 20 shows a block diagram of data flow within the
culling stage 130. This is a high-level schematic diagram
that does not include all data and signals that would be
required in an implementation.

The input to the culling stage 130 is the processing mode
2002, either render or v-query, and a list of records for
transformed polygons 2004 sent by the geometric processor
120. First, data flow is described when the culling stage 130
is operating in render mode and rendering a list of polygons
with Tile Polygon List 800.

In this case, the geometric processor 120 outputs two
records for each polygon, a tiling record and a rendering
record, and these records are buffered in the FIFO of Tiling
Records 2006 and the FIFO of Rendering Records 2008,
respectively.

Tile Polygon List 800 processes polygons one by one until
all polygons on the list have been tiled. For each polygon,
the Tile Stack 2010 is initialized by copying the next tiling
record in the FIFO of Tiling Records 2006 on connection
2012 (step 1104). The Current Tile register 2014 is loaded
from the Tile Stack 2010 on connection 2016 (step 1108).

When it is established that a polygon is visible (at step
1322 or step 1340), the polygon's record in the FIFO of
Rendering Records 2008 is output to the z-buffer renderer
140 on connection 2028. Records in the FIFO of Rendering
Records 2008 that correspond to occluded polygons are
discarded.

Now data flow is considered when the culling stage 130
is operating in v-query mode and determining the visibility
of bounding boxes with Process Batch of Boxes 700. In this
case, the geometric processor 120 outputs tiling records and
markers indicating "end of box" and "end of batch." Tiling
records are buffered in the FIFO of Tiling Records 2006.
When in v-query mode, the geometric processor 120 does
not output rendering records, so none are loaded into the
FIFO of Rendering Records 2008.

Flow of tiling records on connections 2012, 2016, 2022,
and 2026 is the same as when in rendering mode.
Z-values needed for depth comparisons at step 1308 are
read from the list of z-arrays 2018 on connection 2024, but
no z-values are written on this connection. If z-values are
needed for a tile that is not stored in the list of z-arrays 2018,
they are obtained from the z-pyramid 170, which involves
writing an old tile record (if necessary) and reading a new
tile record on connection 2020.

If a visible sample is discovered, the bit in V-Query Status
Bits 2030 corresponding to the current box is set to visible
on connection 2032 (step 710); otherwise the bit is set to
occluded (step 712).

When the visibility of all boxes in the batch has been
established, the V-Query Status Bits 2030 and the tip of the
z-pyramid 170 are sent to the scene manager 110 on the
feedback connection 190 (steps 714 and 716).

Other Ways of Reducing Image-Memory Traffic.

The culling stage preferably uses a low-precision
z-pyramid 170 in order to reduce storage requirements and
memory traffic.

The most straightforward way to implement a
low-precision z-pyramid is to store each z-value in fewer bits
than the customary precision of between 24 and 32 bits. For
instance, storing z-values in 8 bits reduces storage
requirements by a factor of 4 as compared with storing z-values in
32 bits.

Even greater reductions in the storage requirements of a
z-pyramid used for conservative culling can be achieved
with the modifications described below.

Encoding of Depth Values.

Storage requirements of the z-pyramid 170 can be
reduced by storing depth information for tiles in a more
compact form than NxN arrays of z-values.

According to this method, a finest-level tile is stored as a
znear value and an array of offsets from znear, where znear

When Process NxN Tile 1300 performs occlusion and is the depth of the nearest sample within the tile. Preferably, 

overlap tests (steps 1308 and 1310), edge and plane equa- offsets are stored at relatively low precision (e.g., in 4 bits 

tions (which are part of tiling records) are read from the each) and znear is stored at higher precision (e.g., in 12 bits). 

Current Tile register 2014 on connection 2022, and z-values 55 The record for each finest-level tile consists of an NxN 

are read from the list of z-arrays 2018 on connection 2024. array of oflsets, znear, and a scale factor S that is needed to 

Whenever z-values are needed for a tile that is not stored compute depths from oflsets. If znear is stored in 12 bits, S 

in the list of z-arrays 2018, they are obtained from the in 4 bits, and each offset value in 4 bits, the record for a 4x4 

z-pyramid 170, which involves writing an old tile record (if tile requires 80 bits, which is 5 bits per sample. Z-values in 

necessary) and reading a new tile record on connection 60 tiles that are not at the finest level of the pyramid are stored 

2020, When visible samples are encountered, z-values are in arrays, as usual (for example, as arrays of 8-bit z-values). 

written to the fist of z-arrays 2018 on connection 2024 (step FIG. 21 shows a side view of a finest-level tile in the 

1330). When z-values are propagated, z-values are read z-pyramid, which in three dimensions is a rectangular solid 

from and written to the list of z-arrays 2018 on connection 2100 having a square cross-section. Given the indicated 

2024. 65 direction of view 2102, the right-hand end 2104 of the solid 

When new tiles are created (at step 1344), they are written 2100 is the near clipping plane and the left-hand end 2106 

to the Tile Stack 2010 on connection 2026. of the solid 2100 is the far clipping plane. 




The four thin horizontal lines 2116 indicate the positions 
of rows of samples within the tile. The two inclined lines, 
2108 and 2110, indicate the positions of two polygons, 
which are oriented perpendicular to the page to simplify 
illustration.

In this instance, the depth of sample A on polygon 2110
is znear 2112, sample B is not "covered," so its depth is the
depth of the far clipping plane 2106, and sample C on
polygon 2108 is the deepest covered sample within the tile.
The depth of the deepest covered sample within a tile is
called zfar_c (in this case, zfar_c is the depth 2114 of sample
C).

To improve effective depth resolution, one offset value is 
reserved to indicate samples that lie at the far clipping plane 
(that is, samples that have never been covered by a polygon).

For example, suppose that offset values are each 4-bit 
values corresponding to integers 0 through 15, and value 15 
is reserved to mean "at the far clipping plane." Then, offset 
values 0 through 14 would be used to represent depths in the 
range znear to zfar_c.

In general, this requires scaling by the scale factor S,
computed with the following formula 6000:

S = (FAR - NEAR)/(zfar_c - znear),

where NEAR is the depth of the near clipping plane and FAR
is the depth of the far clipping plane. Once S has been
computed, the offset for a covered sample at depth z is
computed with the following encoding formula 6001:

offset = (z - znear)/S,

where offset is rounded to an integer. The inverse decoding
formula 6002 for computing a z-value from an offset is:

z = znear + S*offset.

To simplify evaluation of the encoding and decoding
formulas, scale factor S is rounded to a power of two, which
enables both multiplication and division by S to be performed
by shifting. As a result, computations of both offsets
and z-values are only approximate, but computations are
structured so that depth comparisons are always
conservative, never causing a visible polygon to be culled.
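
The shift-based encoding and decoding can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: depths are integers with larger values farther away, the function names are hypothetical, S is chosen as the smallest power of two that lets offsets 0 through 14 span znear to zfar_c (an assumption made for the sketch, not formula 6000), and offsets are rounded up so that a decoded depth is never nearer than the true depth.

```python
FAR_CODE = 15  # reserved 4-bit offset: sample lies at the far clipping plane


def choose_shift(znear, zfar_c, max_offset=14):
    """Smallest exponent k with (zfar_c - znear) <= max_offset * 2**k."""
    k = 0
    while (zfar_c - znear) > (max_offset << k):
        k += 1
    return k


def encode_tile(depths, far):
    """Encode integer sample depths as (znear, exponent of S, offsets)."""
    covered = [z for z in depths if z < far]
    if not covered:
        return far, 0, [FAR_CODE] * len(depths)
    znear, zfar_c = min(covered), max(covered)
    k = choose_shift(znear, zfar_c)
    offsets = []
    for z in depths:
        if z >= far:
            offsets.append(FAR_CODE)
        else:
            # Divide by S = 2**k with a shift, rounding up so the
            # decoded depth is never nearer than the true depth.
            offsets.append(((z - znear) + (1 << k) - 1) >> k)
    return znear, k, offsets


def decode_tile(znear, k, offsets, far):
    """Decode with z = znear + S*offset, multiplying by S with a shift."""
    return [far if o == FAR_CODE else znear + (o << k) for o in offsets]
```

Because every decoded depth is at or behind the true depth, an occlusion test against the decoded values can only err toward reporting a polygon as potentially visible.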

Given the previous assumptions about using 4-bit offsets, 
in FIG. 21, the offset computed for sample A would be 0 
(because its depth is znear), the offset computed for sample 
C would be 14 (because its depth is zfar_c), the offset
computed for sample D would lie somewhere between 0 and 
14, and the offset for sample B would be 15, since this is the 
value reserved for "at the far clipping plane." 

znear and zfar_c can be computed by procedure Process
NxN Tile 1300 as it loops over the cells within a tile. For
example, to compute znear, step 1302 would initialize 
variable znear to the depth of the far clipping plane and 
following step 1326, variable znear would be updated with 
the depth of the nearest visible sample encountered so far. 

When finest-level tiles in the z-pyramid are encoded,
changes must be made when reading or writing a finest-level
tile in procedures Tile Convex Polygon 1100 and Read
Z-Array 1200. When Read Z-Array 1200 reads the encoded
record of a finest-level tile at step 1206, the z-value of each
sample is computed from znear, S, and the offset value using
the decoding formula 6002 and written to z-array[L] (where 
L is the finest level). 

When writing the record for a finest-level tile, instead of 
writing z-array[L] at step 1120 of Tile Convex Polygon
1100, an encoded tile record is created from z-array[L] and
then written to the z-pyramid 170. The tile record is created 
as follows. 
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First, the exponent of scale factor S is computed by 
computing S with formula 6000, rounding S to a power of 
two, and then determining its exponent (since S is a power 
of 2, it can be stored very compactly as an exponent). 

Then the offset value corresponding to each z-value is 
computed. If S for the tile has not changed since the tile was 
read, the old offset is used for any sample where the polygon 
was not visible. Otherwise, the offset is computed using the
encoding formula 6001. 

Now all of the information in a tile record is known, and 
the record is written to the z-pyramid 170. Following step 
1120 in Tile Convex Polygon 1100, propagation of the tile's 
zfar value is performed in the usual way using z-values that 
are not encoded. 

The method described above makes it possible to con- 
struct highly accurate z-values from low-precision offsets 
whenever the depths of covered image samples within a tile 
lie within a narrow range, which is often the case. In the 
worst case when z-values cover nearly the whole range 
between the near and far clipping planes, this method is 
equivalent to representing z-values solely with low- 
precision offset values, compromising z-resolution. In typi- 
cal scenes, however, depth coherence within finest-level 
tiles is quite high on average, resulting in accurate z-values 
and efficient culling in most regions of the screen. 

Even though the finest level of the z-pyramid is not a 
conventional z-buffer when depth values are encoded as 
described above, herein the terms "z-pyramid" and "hierar- 
chical depth buffer" will still be applied to this data struc- 
ture. 

Reducing Storage Requirements with Coverage Masks. 

Another novel way to reduce the storage requirements of 
a z-pyramid used for conservative culling is to maintain a 
coverage mask at each finest-level tile and the zfar value of 
the corresponding samples, which together will be called a 
mask-zfar pair. According to this method, the record for each 
finest-level tile in the z-pyramid consists of the following 
information, which will be called a mask-zfar record 7000 
for a tile. 

Mask-Zfar Tile Record.

1. zfar value for the whole tile (zfar_t)

2. mask indicating samples within a region of the tile
(mask_t)

3. zfar value for the region indicated by mask_t (zfar_m)

The terms zfar_t, mask_t, and zfar_m are defined above.

Preferably, only tiles at the finest level of the z-pyramid are 
stored in mask-zfar records. At all other levels, tile records 
are arrays of z-values which are maintained by propagation. 
Preferably, individual z-values within these arrays are stored 
at low precision (e.g., in 12 bits) in order to conserve 
storage. 

The advantage of using mask-zfar records is that they 
require very little storage. For example, if zfar_t and zfar_m are
each stored in 12 bits, the record for a 4x4-sample tile would 
require only 40 bits, 24 bits for these z-values and 16 bits for 
masks (one bit for each sample). 

This is only 2.5 bits per sample, more than a three-fold 
reduction in storage compared with storing an 8-bit z-value 
for each sample, and more than a twelve-fold reduction in 
storage compared with storing a 32-bit z-value for each 
sample. 

It is not essential to store zfar_t in mask-zfar records,
because the identical z-value is also stored in the record for
the parent tile. Eliminating zfar_t from mask-zfar records
would reduce storage requirements to 1.75 bits per sample 
for a 4x4 tile, given the assumptions stated above. However, 
this approach requires that the parent tile's records be read 
more often when finest-level tiles are processed, which is a 
disadvantage. 
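
The storage figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The function name and parameters are hypothetical and exist only for this illustration.

```python
def mask_zfar_record_bits(z_bits=12, samples=16, store_zfar_t=True):
    """Bits in one mask-zfar record: the stored z-values plus one mask bit per sample."""
    z_values = 2 if store_zfar_t else 1  # zfar_t and zfar_m, or zfar_m only
    return z_values * z_bits + samples


SAMPLES = 16  # a 4x4 tile

full_record = mask_zfar_record_bits()                       # zfar_t, zfar_m, mask
slim_record = mask_zfar_record_bits(store_zfar_t=False)     # zfar_m, mask

bits_per_sample_full = full_record / SAMPLES
bits_per_sample_slim = slim_record / SAMPLES

# Reduction relative to storing an 8-bit or a 32-bit z-value per sample.
reduction_vs_8_bit = 8 / bits_per_sample_full
reduction_vs_32_bit = 32 / bits_per_sample_full
```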
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FIG. 22 and FIG. 23 show an example illustrating how 
zfar_t advances when polygons that cover a tile are processed.
FIG. 22 shows a 4x4 tile 2200 at the finest level of the
z-pyramid having uniformly spaced samples 2202 that are
covered by two triangles, labeled Q and R.

FIG. 23 shows a side view of the tile 2200, which in three 
dimensions is a rectangular solid 2300 having a square 
cross-section. Given the indicated direction of view 2302, 
the right-hand end 2304 of the solid 2300 is the near clipping 
plane and the left-hand end 2306 of the solid 2300 is the far
clipping plane. 

The four thin horizontal lines 2308 indicate the positions 
of rows of samples within the tile. The two inclined lines 
indicate the positions of triangles Q and R, which are 
oriented perpendicular to the page to simplify the illustration.

When the z-pyramid is initialized at the beginning of a 
frame, mask-zfar records in the z-pyramid are initialized as 
follows: zfar_t is set to the depth of the far clipping plane and
mask_t is cleared to all zeros, meaning that no samples are
covered. Thus, before processing any polygons at tile 2200,
zfar_t is the depth of the far clipping plane 2306 and mask_t is
all zeros.

Suppose that Q is the first polygon processed at tile 2200.
When Q is processed, the bits in mask_t are set that correspond
to the samples covered by Q (these are the samples
within the crosshatched region 2204 in FIG. 22b) and zfar_m
is set to the depth of the farthest sample covered by Q,
labeled zfar_Q in FIG. 23.

Later, when R is processed, its mask (indicated by the
crosshatched region 2206 in FIG. 22c) and its zfar value
within the tile (labeled zfar_R in FIG. 23) are computed. Since
R's mask 2206 and mask_t (in this case, Q's mask 2202)
collectively cover the tile 2200, a nearer value has been
established for zfar_t, in this case zfar_Q, so zfar_t is set to zfar_Q.

This illustrates how zfar_t advances when one or more
polygons covering a tile are processed, which enables con- 
servative culling of occluded polygons that are encountered 
later. 

Next, the general method is described for updating a
mask-zfar record when a polygon is processed. Cases that
need to be considered are schematically illustrated in FIG.
24.

FIG. 24 shows a side view of a 4x4 tile, which in three 
dimensions is a rectangular solid 2400 having a square
cross-section. Given the indicated direction of view 2402, 
the right-hand end 2404 of the solid 2400 is the near clipping 
plane and the left-hand end 2406 of the solid 2400 is the far 
clipping plane. 

The four thin horizontal lines 2408 indicate the positions
of rows of samples within the tile. The bold vertical lines at
depths zfar_t and zfar_m represent the occlusion information
stored in the tile's mask-zfar record. The bold line at depth
zfar_t covers the whole tile and the bold line at depth zfar_m
indicates the samples covered by mask_t.

The numeral 2410 identifies a polygon that is oriented 
perpendicular to the page. 

The dashed vertical lines labeled P_1, P_2, P_3, P_4, and P_5
represent possible positions of the next polygon to be
processed, indicating the region of the tile covered by visible
samples on the polygon and the polygon's zfar value in
relation to zfar_m and zfar_t. Here, the "polygon's zfar value"
is the farthest z of its potentially visible samples, so this
z-value must be nearer than zfar_t.

Although coverage is only depicted schematically, the
basic cases are distinguished: the polygon covers the whole
tile (case P_3), the polygon covers the tile in combination
with mask_t (cases P_1 and P_4), and the polygon does not cover
the tile in combination with mask_t (cases P_2 and P_5).

If each sample on a polygon lies behind zfar_t or is covered
by mask_t and lies behind zfar_m, the polygon is occluded
within the tile. For example, polygon 2410 in FIG. 24
(oriented perpendicular to the page for convenience), is
occluded because sample 2412 is inside mask_t and behind
zfar_m and sample 2414 is behind zfar_t.

When using mask-zfar records in the z-pyramid, changes 
must be made when reading or writing a finest-level tile in 
procedures Tile Convex Polygon 1100 and Read Z-Array 
1200. When step 1206 of Read Z-Array 1200 reads the 
mask-zfar record of a finest-level tile (which includes zfar_t,
mask_t, and zfar_m), the z-value of each sample is written to
z-array[L] (where L is the finest level). The z-value of each
sample covered by mask_t is zfar_m and the z-value of all other
samples is zfar_t.
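
Expanding a mask-zfar record into per-sample z-values, and using the result for the occlusion rule stated earlier, can be sketched as follows. This is an illustrative sketch only: the function names, the bit ordering of the mask (bit i corresponds to sample i), and the convention that larger depths are farther away are assumptions, and a strict comparison is used so that equal depths are treated as potentially visible.

```python
def expand_record(zfar_t, mask_t, zfar_m, n_samples=16):
    """Per-sample z-values: zfar_m where mask_t is set, zfar_t elsewhere."""
    return [zfar_m if (mask_t >> i) & 1 else zfar_t for i in range(n_samples)]


def polygon_occluded(sample_depths, zfar_t, mask_t, zfar_m, n_samples=16):
    """sample_depths maps a sample index to the polygon's depth at that sample.

    The polygon is occluded within the tile only if every one of its samples
    lies behind the z-value stored for that sample.
    """
    z_array = expand_record(zfar_t, mask_t, zfar_m, n_samples)
    return all(z > z_array[i] for i, z in sample_depths.items())
```

For a four-sample tile with zfar_t = 900, mask_t = 0b0011, and zfar_m = 400, a polygon with a sample at depth 450 inside the mask and a sample at depth 950 outside it is occluded, whereas a sample at depth 500 outside the mask is not.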

When writing the record for a finest-level tile, instead of 
writing z-array[L] at step 1120 of Tile Convex Polygon 
1100, a new mask-zfar record is created from z-array[L] 
with procedure Update Mask-Zfar Record 2500 and this 
record is written to the z-pyramid. 

If all samples on the polygon are occluded (as with 
polygon 2410, for example), step 1120 is not executed, so 
neither is Update Mask-Zfar Record 2500. 

Update Mask-Zfar Record 2500 (FIG. 25) receives as
input the values in the old mask-zfar record (i.e., zfar_t, zfar_m,
and mask_t), the mask for samples where the polygon is
visible within the tile (call this mask_p), and the zfar value of
these samples (call this zfar_p). mask_p and zfar_p can be
computed efficiently within Process NxN Tile 1300 as it
loops over the samples in a tile.

At step 2502, if mask_p covers the whole tile (i.e., it is all
ones, which means that the polygon is visible at all samples,
as for case P_3 in FIG. 24), at step 2504 zfar_t is set to zfar_p
and mask_t is cleared to all zeros, and the procedure terminates
at step 2506. Otherwise, control proceeds to step 2508
where if mask_t | mask_p is all ones (where "|" is the logical
"or" operation), the polygon and mask_t collectively cover
the tile, and in this case, control proceeds to step 2510.

At step 2510, if zfar_p is nearer than zfar_m (e.g., P_4 in FIG.
24), a nearer zfar value has been established and step 2512
sets zfar_t to zfar_m, mask_t to mask_p, and zfar_m to zfar_p,
followed by termination at step 2514. If zfar_p is not nearer
than zfar_m at step 2510 (e.g., P_1 in FIG. 24), step 2516 sets
zfar_t to zfar_p, followed by termination at step 2514.

If mask_t | mask_p is not all ones at step 2508, the polygon
and mask_t do not collectively cover the tile, and the occlusion
information for the polygon and mask_t are combined as
follows. Step 2518 sets mask_t to mask_t | mask_p (where "|" is
the logical "or" operation). Next, at step 2520, if mask_t is all
zeros, control proceeds to step 2524, which sets zfar_m to
zfar_p, followed by termination of the procedure at step 2526.

If mask_t is not all zeros at step 2520, control proceeds to
step 2522, where, if zfar_p is farther than zfar_m, control
proceeds to step 2524. For example, with P_2 in FIG. 24, zfar_p
is farther than zfar_m, so step 2524 would be executed.

If zfar_p is not farther than zfar_m at step 2522 (as is the case
with P_5 in FIG. 24), the procedure terminates at step 2526.

Some of the operations performed by Update Mask-Zfar 
Record 2500 can be done in parallel. 
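
The control flow of Update Mask-Zfar Record 2500 can be sketched in software as follows. This is a hedged illustration, not the hardware procedure itself: the class and function names are hypothetical, depths are integers with larger values farther away, a 16-bit mask stands for a 4x4 tile, and the test at step 2520 is read as applying to the mask as it was before the merge at step 2518 (an assumption, since a merged mask cannot be empty when the polygon has visible samples).

```python
from dataclasses import dataclass

FULL_MASK = 0xFFFF  # one bit per sample of a 4x4 tile


@dataclass
class MaskZfarRecord:
    zfar_t: int  # conservative zfar for the whole tile
    mask_t: int  # samples belonging to the covered region
    zfar_m: int  # zfar of the samples in mask_t


def update_mask_zfar(rec, mask_p, zfar_p):
    """Return the updated record for a polygon visible at mask_p with zfar zfar_p."""
    if mask_p == FULL_MASK:                                    # steps 2502, 2504
        return MaskZfarRecord(zfar_p, 0, rec.zfar_m)
    if (rec.mask_t | mask_p) == FULL_MASK:                     # step 2508
        if zfar_p < rec.zfar_m:                                # step 2510 -> 2512
            return MaskZfarRecord(rec.zfar_m, mask_p, zfar_p)
        return MaskZfarRecord(zfar_p, rec.mask_t, rec.zfar_m)  # step 2516
    merged = rec.mask_t | mask_p                               # step 2518
    # Assumption: step 2520 tests the mask as it was before the merge.
    if rec.mask_t == 0 or zfar_p > rec.zfar_m:                 # steps 2520, 2522
        return MaskZfarRecord(rec.zfar_t, merged, zfar_p)      # step 2524
    return MaskZfarRecord(rec.zfar_t, merged, rec.zfar_m)


# Usage mirroring the Q-then-R example, with made-up masks and depths.
FAR = 4095
rec = MaskZfarRecord(FAR, 0, FAR)
rec = update_mask_zfar(rec, 0x00FF, 500)  # Q covers part of the tile
rec = update_mask_zfar(rec, 0xFFF0, 300)  # R completes coverage, nearer than Q
```

After both calls, zfar_t has advanced from the far clipping plane to Q's zfar value, matching the narrative for FIG. 22 and FIG. 23.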

In summary, the advantage of using mask-zfar pairs to 
store occlusion information in a z-pyramid used for conser- 
vative culling is that it requires very little storage (for 
example, 2.5 bits per image sample). The disadvantage of 
this approach is that maintaining occlusion information is 
more complicated and culling efficiency may not be as high. 
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To illustrate the savings in storage that can be achieved, 
when the finest level of a z-pyramid having 4x4 decimation 
is stored as mask-zfar tile records, each record including two 
12-bit z-values and one 16-bit coverage mask, and the other
levels of the z-pyramid are stored as arrays of 12-bit
z-values, the z-pyramid requires approximately 3.30 bits of
storage per sample in the finest level. In this case, the total
bits of storage in a 32-bit z-buffer having the same resolution
is approximately ten times greater than the total bits of
storage in the z-pyramid.
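
The figure of approximately 3.30 bits per sample can be checked with a short calculation. The sketch below is an illustration with hypothetical names; it assumes that each level above the finest holds one 12-bit z-value per tile of the level below, so that its cost per finest-level sample shrinks by a factor of 16 per level.

```python
def pyramid_bits_per_sample(record_bits=40, tile_samples=16,
                            z_bits=12, decimation=16, coarse_levels=8):
    """Total z-pyramid storage divided by the number of finest-level samples."""
    finest = record_bits / tile_samples  # mask-zfar records at the finest level
    # Level k above the finest stores one z-value per decimation**k samples.
    coarse = sum(z_bits / decimation ** k for k in range(1, coarse_levels + 1))
    return finest + coarse


bits = pyramid_bits_per_sample()
ratio_to_32_bit_zbuffer = 32 / bits
```

The finest level contributes 2.5 bits per sample and the coarser levels together contribute about 0.8, so a 32-bit z-buffer needs a little under ten times as much storage.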

Even though the finest level of the z-pyramid is not a 
conventional z-buffer when mask-zfar records are 
employed, herein the terms "z-pyramid" and "hierarchical 
depth buffer" will still be applied to this data structure. 

The prior art includes the A-buffer visible-surface algorithm
that maintains pixel records that include coverage
masks and z-values. At individual pixels, the A-buffer algorithm
maintains a linked list of visible polygon fragments,
the record for each fragment including a coverage mask
indicating the image samples covered by the fragment, color
and opacity values, and znear and zfar values, each stored in
floating-point format. This record format is designed to
resolve color and visibility at each image sample, enabling
high-quality antialiasing of pixel values.

Although the A-buffer record format could be employed
at finest-level tiles in the z-pyramid, its variable-length,
linked-list format greatly complicates processing and
requires dynamic memory allocation. By comparison, the
novel method of performing conservative occlusion culling
using a single coverage mask at a tile is much simpler and
much easier to implement in hardware.

Culling with a Low-Resolution Z-Pyramid.

As previously mentioned, a separate culling stage 130 in
the graphics system 100 enables conservative culling with a
low-precision z-pyramid, that is, a z-pyramid having the
same resolution as the z-buffer, but in which z-values are
stored at low precision, for example, as 8-bit or 12-bit
values. Alternatively, the culling stage 130 can employ a
low-resolution z-pyramid, that is, a z-pyramid having lower
resolution than the z-buffer. As previously mentioned, the
resolution of a z-pyramid is the resolution of its finest level.

For example, a single zfar value could be maintained in
the finest level of the z-pyramid for each 4x4 tile of image
samples in the output image 150. As applied to the 64x64
image raster of FIG. 2 (only partially shown), level 230
would be the finest level of the low-resolution z-pyramid,
and each cell within this level would represent a conservative
zfar value for the corresponding 4x4 tile of image
samples in the image raster. For instance, cell 218 would
contain a conservative zfar value for the image samples in
4x4 tile 220.

Definitive visibility tests cannot be performed using a
low-resolution z-pyramid, but conservative culling can be
performed. The disadvantage of a low-resolution z-pyramid
is that it has lower culling efficiency than a standard
z-pyramid, and this increases the workload on the z-buffer
renderer 140.

However, a low-resolution z-pyramid has the advantage
of requiring only a fraction of the storage, and storage
requirements can be further reduced by storing zfar values at
low precision (e.g., 12 bits per value). In cases where the
reduction in storage requirements enables the z-pyramid to
be stored entirely on-chip, the resulting acceleration of
memory access can improve performance substantially. In
short, using a low-resolution z-pyramid impairs culling
efficiency but reduces storage requirements and can increase
culling speed in some cases.

To illustrate the savings in storage that can be achieved
with a low-resolution z-pyramid, consider a graphics system
with a 1024 by 1024 z-buffer in the rendering stage and a
256 by 256 z-pyramid in the culling stage. Assuming 32-bit
z-values in the z-buffer, 12-bit z-values in the z-pyramid,
and 4x4 decimation from level to level of the z-pyramid, the
total bits of storage in the z-buffer would be approximately
40 times greater than the total bits of storage in the
z-pyramid.

Using a low-resolution z-pyramid requires only minor
changes to the rendering algorithm that has already been
described for the graphics system 100 of FIG. 1. In fact, it
is only necessary to change procedure Process NxN Tile
1300.

At step 1324, control proceeds to step 1334, which
determines whether the polygon completely "covers" the
cell. This occurs only if the cell is completely inside all of
the polygon's edges. Whether a cell lies completely inside
an edge can be determined with the edge-cell test described
in connection with step 1310, except that instead of substituting
the cell's corner that is farthest in the "inside direction"
into the edge equation, the opposite corner is substituted.

If the polygon does not completely cover the cell, control
returns to step 1304. Otherwise, step 1336 computes the zfar
value of the plane of the polygon within the cell, which is
done as previously described for computing the plane's
znear value at step 1308, but instead of substituting the
"nearest corner" of the cell into the plane equation, the
opposite corner is substituted, since this is where the plane
is farthest within the cell.

In FIG. 14, for example, the corner 1408 is the "nearest
corner" of cell 1402, meaning that the plane of polygon 1400
is nearest to the observer at that corner. Therefore, the plane
of polygon 1400 is farthest from the observer at the opposite
corner 1410, so to establish the zfar value for the plane of
polygon 1400 within cell 1402, the x and y coordinates of
this corner 1410 are substituted into the plane equation,
which has the form z=Ax+By+C.

If at step 1336 the plane's zfar value is nearer than the
corresponding value for the current cell in z-array[F] (where
F is the index of the finest level), control proceeds to step
1326, which sets changed to TRUE. Then step 1328 updates
zfar_finest, overwriting zfar_finest with the plane's zfar
value, if the plane's zfar value is farther than the current
value of zfar_finest. Next, step 1330 overwrites the value
for the current cell in z-array[F] with the plane's zfar value,
and control returns to step 1304.

The optional shading step 1332 is not compatible with
using a low-resolution z-pyramid. At step 1336, if the
plane's zfar value is not nearer than the corresponding value
in z-array[F], control returns directly to step 1304.

FIG. 26 shows a side view of a cell in the z-pyramid, 
which in three dimensions is a rectangular solid 2600 having 
a square cross-section. Given the indicated direction of view 
2602, the right-hand end 2604 of the solid 2600 is the near 
clipping plane and the left-hand end 2606 of the solid 2600 
is the far clipping plane. The bold vertical line indicates the 
current z-value 2608 stored in the z-pyramid cell. 

The three inclined lines, 2610, 2620, and 2630, indicate 
the positions of three polygons, each covering the cell and 
each oriented perpendicular to the page to simplify illustra- 
tion. For each polygon, its znear and zfar values within the 
cell are shown by dashed lines. 

Now, the procedure Process NxN Tile 1300 processes 
these polygons within this cell, assuming a low-resolution 
z-pyramid. 
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Polygon 2610 would be determined to be occluded within 
the cell at step 1308, because its znear value 2612 is farther 
than the z-pyramid value 2608. 

Polygon 2620 would be determined to be visible because 
its znear value 2622 is nearer than the current z-pyramid
value 2608, but the z-pyramid would not be overwritten with 
the polygon's zfar value 2624 because the polygon's zfar 
value 2624 is farther than the current z-pyramid value 2608. 

Polygon 2630 would be determined to be visible because 
its znear value 2632 is nearer than the current z-pyramid
value 2608, and the z-pyramid would be overwritten with 
the polygon's zfar value 2634 because the polygon's zfar 
value 2634 is nearer than the current z-pyramid value 2608. 
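
The three outcomes described for polygons 2610, 2620, and 2630 can be sketched as a single per-cell decision. This is an illustrative sketch under stated assumptions: the function name is hypothetical, the polygon is assumed to cover the cell completely (as in FIG. 26), larger depths are farther away, and the numeric depths in the usage lines are made up because the figure gives none.

```python
def process_cell(stored_zfar, poly_znear, poly_zfar):
    """Return (status, new_stored_zfar) for a polygon that fully covers the cell."""
    if poly_znear > stored_zfar:
        # The polygon's nearest point is behind the stored value: occluded.
        return "occluded", stored_zfar
    if poly_zfar < stored_zfar:
        # The polygon's farthest point is nearer than the stored value: overwrite.
        return "visible", poly_zfar
    # Potentially visible, but it does not establish a nearer zfar value.
    return "visible", stored_zfar


stored = 500  # hypothetical depth standing in for z-pyramid value 2608
result_2610 = process_cell(stored, poly_znear=600, poly_zfar=700)
result_2620 = process_cell(stored, poly_znear=400, poly_zfar=600)
result_2630 = process_cell(stored, poly_znear=200, poly_zfar=300)
```

Only the third polygon tightens the stored value, which is what allows later polygons behind it to be culled.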

Now an alternative way of updating a low-resolution 
z-pyramid in the culling stage 130 is described. When the
z-buffer renderer 140 encounters visible depth samples on a
polygon, they are copied to the culling stage 130 and 
propagated through the z-pyramid 170. 

This method requires a connection 185 for copying 
z-values from the z-buffer renderer 140 to the culling stage
130, which is drawn in a dashed arrow in FIG. 1 to indicate 
that this is just an option. If z-values in the z-pyramid 170 
are stored at lower precision than z-values in the z-buffer 
180, z-values may be converted to low-precision values 
before they are copied. When the culling stage 130 receives
new depth samples on connection 185, they are propagated 
through the z-pyramid using the traditional propagation 
algorithm. 

When this method is employed, it is not necessary to
update the z-pyramid during tiling of polygons by the
culling stage, which simplifies the tiling algorithm considerably.
In fact, in procedures Tile Convex Polygon 1100 and
Process NxN Tile 1300, only the steps performed in v-query
mode are necessary, except for outputting rendering records
when visible polygons are encountered.
Varying Z Precision within a Z-Pyramid. 

In the description of procedure Process NxN Tile 1300, 
for the preferred embodiment of the invention, the culling 
and rendering stages are separate and have their own depth 
buffers, but it is possible to combine the two stages in a
single "hierarchical renderer" having a single z-pyramid 
used for both culling and rendering. 

In this case, the finest level of the z-pyramid is a z-buffer 
in which z-values are stored at full precision (e.g., in 32 bits 
per z-value) so that visibility can be established definitively
at each image sample. At other pyramid levels, however, it 
is not necessary to store z-values at full precision, since 
culling at those levels is conservative. 

Thus, at all but the finest pyramid level, it makes sense to
store z-values at low precision (e.g., in 12 bits) in order to
conserve storage and memory bandwidth and improve caching
efficiency. Frequently, only z-values at coarse levels of
the pyramid need to be accessed to determine that a bounding
box or primitive is occluded, so caching the coarsest levels
of the pyramid can accelerate culling significantly. Using
low-precision z-values enables more values to be stored in
a cache of a given size, thereby accelerating culling.

When low-precision z-values are employed in a pyramid 
as described above, the average precision of z-values in the 
z-buffer is higher than the average precision of z-values in
the entire z-pyramid. For example, for a z-pyramid having 
4x4 decimation from level to level and a 1024 by 1024 
z-buffer in which z-values are stored at 32 bits of precision, 
and in which z-values in the other pyramid levels are stored 
at 12 bits of precision, then the average z-precision in the
z-buffer is 32 bits per z-value and average z-precision in the 
entire z-pyramid is approximately 30.9 bits per z-value. 



Exploiting Frame Coherence. 

As described supra, the efficiency of hierarchical 
z-buffering with box culling is highly sensitive to the order 
in which boxes are traversed, with traversal in near-to-far 
occlusion order achieving maximal efficiency. Render 
Frames with Box Culling 500 achieves favorable traversal 
order by explicitly sorting boxes into "layers" every frame. 

Another method for achieving efficient traversal order, 
which is described next, is based on the principle that 
bounding boxes that were visible in the last frame are likely 
to be visible in the current frame and should, therefore, be 
processed first. 

This principle underlies the procedure, Render Frames 
Using Coherence 2700 (FIG. 27), which works as follows. 
The scene manager 110 maintains four lists of box records: 

1. boxes that were visible last frame (visible-box list 1); 

2. boxes that were not visible last frame (hidden-box list
1);

3. boxes that are visible in the current frame (visible-box
list 2); and 

4. boxes that are not visible in the current frame (hidden- 
box list 2). 

"Hidden" boxes include both occluded and off-screen 
boxes. In step 2702, the scene manager 110 organizes all 
scene polygons into polyhedral bounding boxes, each con- 
taining some manageable number of polygons (e.g., between 
50 and 100). In step 2704, the scene manager 110 clears 
visible-box list 1 and hidden-box list 1, and appends all 
boxes in the scene to hidden-box list 1. 

Now the system has been initialized and is ready to render 
sequential frames. First, step 2706 initializes the output 
image 150, z-pyramid 170, and z-buffer 180 (z-values are 
initialized to the depth of the far clipping plane). 

Next, step 2708 reads boxes in first-to-last order from 
visible-box list 1 and processes each box, as follows. First, 
it tests the box to see if it is outside the view frustum, and 
if the box is outside, its record in the list is marked 
off-screen. 

If the box is not outside, the polygons on its polygon list 
are rendered with procedure Render Polygon List 300. When 
the first frame is rendered, visible-box list 1 is null, so step
2708 is a null operation. 

Next, step 2710 reads boxes in first-to-last order from 
hidden-box list 1 and processes each box as follows. First, 
it tests the box to see if it is outside the view frustum, using 
the method described at step 608 of procedure 600, and if the 
box is outside, its record in the list is marked off-screen. 

If the box is not outside and it intersects the "near face" 
of the view frustum, its record in the list is marked visible 
and its polygons are rendered with Render Polygon List 300. 
If the box is not outside and it does not intersect the near 
face, it is tested for occlusion with respect to the tip of the 
z-pyramid with Is Box Occluded by Tip 1900, and if it is 
occluded, its record in the list is marked occluded. 

Otherwise, the box is batched together with other boxes 
(neighbors on hidden-box list 1) and processed with Process 
Batch of Boxes 700 operating in render mode. If the box is 
visible, this procedure 700 renders the box's polygon list. 
Otherwise, the box's record in the list is marked occluded. 

Now all polygons in visible boxes have been rendered 
into the output image 150, which is displayed at step 2712. 
The remaining task before moving on to the next frame is to 
establish which boxes are visible with respect to the 
z-pyramid. 
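
As a rough illustration only, the rendering pass of steps 2708 and 2710 might be sketched in Python as follows. The Box record, the mark strings, and the five callback parameters are hypothetical names introduced for this sketch. The callback batch_is_visible stands in for Process Batch of Boxes 700 in render mode and, as a simplifying assumption, tests one box at a time rather than a batch.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    polygons: list = field(default_factory=list)
    mark: str = ""  # "", "off-screen", "visible", or "occluded"


def render_pass(visible_1, hidden_1, outside_frustum, hits_near_face,
                occluded_by_tip, batch_is_visible, render_polygons):
    # Step 2708: boxes visible last frame are rendered first.
    for box in visible_1:
        box.mark = ""
        if outside_frustum(box):
            box.mark = "off-screen"
        else:
            render_polygons(box)

    # Step 2710: boxes hidden last frame are tested before rendering.
    for box in hidden_1:
        box.mark = ""
        if outside_frustum(box):
            box.mark = "off-screen"
        elif hits_near_face(box):
            box.mark = "visible"
            render_polygons(box)
        elif occluded_by_tip(box):
            box.mark = "occluded"
        elif batch_is_visible(box):
            # Procedure 700 renders a visible box itself; the record
            # stays unmarked, so step 2718 re-queries it later.
            render_polygons(box)
        else:
            box.mark = "occluded"
```

In this sketch a box that the batch test finds visible is left unmarked, which matches the later rule that unmarked boxes on hidden-box list 1 are re-tested in v-query mode.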

First, step 2714 clears visible-box list 2 and hidden-box
list 2. Next, step 2716 reads boxes in first-to-last order from
visible-box list 1 and processes each box as follows. If the
box was marked off-screen at step 2708, it is appended to
hidden-box list 2. If the box was not marked off-screen and
it intersects the "near face" of the view frustum, the box is
appended to visible-box list 2.

If the box was not marked off-screen and it does not
intersect the near face, it is tested for occlusion with respect
to the tip of the z-pyramid with Is Box Occluded by Tip
1900, and if it is occluded, the box is appended to hidden-
box list 2. Otherwise, the box is batched together with other
boxes (neighbors on visible-box list 1) and processed with
Process Batch of Boxes 700 operating in v-query mode in
order to determine its visibility. If the box is visible, it is
appended to visible-box list 2, and if it is occluded, it is
appended to hidden-box list 2.

Next, step 2718 reads boxes in first-to-last order from
hidden-box list 1 and processes each box as follows. If the
box was marked off-screen or occluded at step 2710, it is
appended to hidden-box list 2. If the box was marked visible
at step 2710, it is appended to visible-box list 2.

Otherwise, the box is batched together with other boxes
(neighbors on hidden-box list 1) and processed with Process
Batch of Boxes 700 operating in v-query mode in order to
determine its visibility. If the box is visible, it is appended
to visible-box list 2, and if it is occluded, it is appended to
hidden-box list 2.

Next, step 2720 renames hidden-box list 2 to hidden-box
list 1 and renames visible-box list 2 to visible-box list 1.
Then, step 2722 updates the bounds of boxes containing
moving polygons (if any), and control returns to step 2706
to begin the next frame.

When there is a high degree of frame coherence, as is
usually the case with animation, after rendering the first
frame, the algorithm just described approaches the efficiency
of near-to-far traversal while avoiding the trouble and
expense of performing explicit depth sorting or maintaining
the scene model in a spatial hierarchy. Efficient traversal
order results from processing boxes first that were visible in
the preceding frame (i.e., the boxes on visible-box list 1).

In addition, the order of boxes on the lists is the order in
which their visibility was established, which is often corre-
lated with occlusion order, particularly if the viewpoint is
moving forward. Consequently, first-to-last traversal of lists
improves the culling efficiency of procedure Render Frames
Using Coherence 2700.

A similar strategy for exploiting frame coherence has
been employed to accelerate z-buffering of models orga-
nized in an octree when the z-pyramid is maintained in
software and cannot be accessed quickly by the polygon-
tiling hardware.

Tiling Look-Ahead Frames to Reduce Latency

When rendering complex scenes in real time, the amount
of storage needed for a scene model may exceed the capacity
of memory that is directly accessible from the scene man-
ager 110, called scene-manager memory. In this case, it may
be necessary during the rendering of a frame to read part of
the scene model from another storage device (e.g., a disk),
which causes delay. Such copying of scene-model data into
scene-manager memory from another storage device will be
referred to as paging the scene model.

Paging the scene model can be controlled with standard
virtual-memory techniques, "swapping out" data that has not
been recently accessed, when necessary, and "swapping in"
data that is needed.

When rendering scene models that are too large to fit in
scene-manager memory, preferably, frames are rendered
with procedure Render Frames with Box Culling 500 and
records for bounding boxes are stored separately from the
list of primitives that they contain. The record for a bounding
box includes records for its faces and a pointer to the list of
primitives that the box contains. Box records are retained in
scene-manager memory and the lists of polygons associated
with bounding boxes are swapped into and out of scene-
manager memory as necessary.

The advantage of organizing the scene model in this way
is that the only time that paging of the scene model is
required when rendering a frame is when a bounding box is
visible and its polygon list is currently swapped out. This
occurs, for example, at step 720 of procedure Process Batch
of Boxes 700, if the polygon list associated with a visible
bounding box is not already present in scene-manager
memory, in which case the polygon list must be copied into
scene-manager memory before the scene manager 110 can
initiate rendering of the polygon list.

Although the approach just described can reduce paging
of the scene model, at some frames a large number of
bounding boxes can come into view, and when this occurs,
the time it takes to copy swapped-out lists of polygons into
scene-manager memory can delay rendering of the frame.
The "look-ahead" method employed herein to reduce such
delays is to anticipate which bounding boxes are likely to
come into view and read their polygon lists into scene-
manager memory, if necessary, so they will be available
when needed. This approach enables delays caused by
paging of the model to be distributed over a sequence of
frames, resulting in smoother animation.

According to this method, first it is estimated where the
view frustum will be after the next few frames have been
rendered. This estimated frustum will be called the look-
ahead frustum.

Then, as the next few frames are being rendered, a
look-ahead frame corresponding to the look-ahead frustum
is created using a procedure that is similar to rendering an
ordinary frame, except that no output image is produced.
Rather, processing of primitives stops after they are tiled into
a z-pyramid, which is separate from the z-pyramid used to
render ordinary frames and which will be called the look-
ahead z-pyramid.

When tiling of a look-ahead frame has been completed, all
primitives which are visible in that frame have been paged
into scene-manager memory and will be available if they are
needed when rendering ordinary frames.

To support creation of look-ahead frames in the graphics
system of FIG. 1, the culling stage 130 includes a look-ahead
z-pyramid 195 (shown in dashed lines to indicate that this is
just an option) and frame-generation procedures are modi-
fied so that a look-ahead frame can be generated gradually
while one or more ordinary frames are being rendered.

Look-ahead frames are created with procedure Create
Look-Ahead Frame 2900, shown in FIG. 29. This procedure
is similar to rendering an ordinary frame with box culling,
except that primitives are not passed on to the z-buffer
renderer 140 after they are tiled into the look-ahead
z-pyramid 195. This procedure 2900 is executed a little at a
time, as the graphics system renders ordinary frames.

Procedure Create Look-Ahead Frame 2900 begins with
step 2902, which clears the look-ahead z-pyramid 195 to the
far clipping plane.

Next, step 2904 estimates where the view frustum will be
after some small amount of time, for example, where the
view frustum will be after another twenty frames have been
rendered. This look-ahead frustum is determined by extrapo-
lating the position of the viewpoint based on the position of
the viewpoint in preceding frames, extrapolating the direc-
tion of view based on the direction of view in preceding
frames, and constructing a frustum from the extrapolated
viewpoint and direction of view. Preferably, look-ahead 
frames are created with a wider view angle than ordinary 
frames so that more of the scene will be visible. 
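
The extrapolation of step 2904 could be sketched as below. The text does not specify the extrapolation scheme, so this sketch assumes simple linear extrapolation from the two most recent frames. The function names, the dictionary layout, and the widen factor are hypothetical and chosen only for illustration.

```python
import math


def extrapolate(prev, curr, frames_ahead):
    """Linearly extrapolate a 3-vector using its per-frame change."""
    return tuple(c + (c - p) * frames_ahead for p, c in zip(prev, curr))


def normalize(v, fallback):
    """Return v scaled to unit length, or fallback if v has zero length."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        return fallback
    return tuple(x / length for x in v)


def look_ahead_frustum(prev_eye, curr_eye, prev_dir, curr_dir,
                       frames_ahead=20, view_angle_deg=60.0, widen=1.5):
    # Extrapolated viewpoint and direction of view.
    eye = extrapolate(prev_eye, curr_eye, frames_ahead)
    direction = normalize(extrapolate(prev_dir, curr_dir, frames_ahead),
                          fallback=curr_dir)
    # Wider view angle than ordinary frames (widen is an assumed factor).
    angle = min(view_angle_deg * widen, 179.0)
    return {"eye": eye, "direction": direction, "view_angle_deg": angle}
```

For a viewpoint that moved one unit along x between the last two frames, the extrapolated viewpoint twenty frames ahead lies twenty further units along x. A real system might prefer a smoother predictor over more than two frames.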

Next, procedure Sort Boxes into Layers 600, which has
already been described, sorts the scene model's bounding
boxes into layer lists to facilitate their traversal in approxi-
mately near-to-far order within the look-ahead frustum. This
procedure also creates a near-box list containing the boxes
that intersect the near face of the look-ahead frustum. To
distinguish these lists from the lists used when rendering
ordinary frames, they will be called the look-ahead layer
lists and the look-ahead near-box list.

Next, step 2906 processes the polygon lists associated
with the bounding boxes on the look-ahead near-box list.
First, any of these polygon lists which are not already
present in scene-manager memory are copied into scene-
manager memory. Then, each polygon list is tiled into the
look-ahead z-pyramid 195 using a modified version of
procedure Render Polygon List 300 which operates as
previously described, except that the procedure and its
subprocedures access the look-ahead z-pyramid 195 (instead
of the other z-pyramid 170), procedure Transform & Set Up
Polygon 900 does not create or output rendering records,
procedure Process NxN Tile 1300 does not output polygons
to the z-buffer renderer 140, and step 306 of Render Polygon
List 300 is omitted.

Next, step 2908 processes the look-ahead layer lists using
a modified version of procedure Process Batch of Boxes
700. This procedure 700 operates as previously described,
except that it and its subprocedures access the look-ahead
z-pyramid 195 (instead of the other z-pyramid 170) and
procedure Render Polygon List 300 (executed at step 720) is
modified as described above.

To enable step 702 of procedure Process Batch of Boxes
700 to cull bounding boxes that are occluded by the look-
ahead z-pyramid 195, the culling stage copies the tip of the
look-ahead z-pyramid 195 to the scene manager 110 at step
716 of procedure Process Batch of Boxes 700. The scene
manager 110 stores this occlusion data separately from the
tip of the other z-pyramid 170.

At step 720 of procedure Process Batch of Boxes 700, if 
a polygon list is not already present in scene-manager 
memory, it must be copied into scene-manager memory 
prior to tiling with procedure Render Polygon List 300.

Following step 2908, procedure Create Look-Ahead
Frame 2900 terminates at step 2910, and work begins on the
next look-ahead frame. When a look-ahead frame is
completed, all polygons which are visible in that frame have
been copied into scene-manager memory and will be avail-
able if they are needed when rendering an ordinary frame.
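
The paging effect of a completed look-ahead frame can be illustrated with a minimal sketch. PolygonListCache, tile_look_ahead_frame, and their parameters are hypothetical names, and swapping out of rarely used lists is deliberately not modeled here.

```python
class PolygonListCache:
    """Tracks which polygon lists are resident in scene-manager memory."""

    def __init__(self, load_from_storage):
        self._load = load_from_storage  # e.g. reads a list from disk
        self._resident = {}
        self.page_ins = 0               # number of copies from storage

    def get(self, box_id):
        # Page the polygon list in only if it is not already resident.
        if box_id not in self._resident:
            self._resident[box_id] = self._load(box_id)
            self.page_ins += 1
        return self._resident[box_id]


def tile_look_ahead_frame(cache, visible_box_ids, tile_polygons):
    """Tile each visible box's polygons into the look-ahead z-pyramid.

    Paging happens as a side effect of fetching each polygon list.
    """
    for box_id in visible_box_ids:
        tile_polygons(cache.get(box_id))
```

After tile_look_ahead_frame has run, a later cache.get for any of those boxes during an ordinary frame returns immediately without a further page-in, which is the latency benefit described above.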

Execution of procedure Create Look-Ahead Frame 2900 
is interleaved with execution of steps 504 through 510 of 
procedure Render Frames with Box Culling 500 (which 
renders ordinary frames), with the scene manager 110 con-
trolling switching from one procedure to the other.

Preferably, work on look-ahead frames is done at times 
when the components that it requires are not being used by 
Render Frames with Box Culling 500. For example, when 
Process Batch of Boxes 700 is rendering an ordinary frame,
after a batch of bounding boxes is processed by the geo-
metric processor 120 and the culling stage 130, there is a
delay before the associated polygon lists are sent through the
system, since it takes time to report the visibility of boxes.
During this delay, a batch of boxes for the look-ahead frame
can be processed by the geometric processor 120 and the 
culling stage 130. 

Also, if processing of an ordinary frame is completed in
less than the allotted frame time (e.g., in less than one
thirtieth of a second), work can be performed on a look-
ahead frame.

Preferably, the resolution of the look-ahead z-pyramid 
195 is lower than the resolution of ordinary frames in order 
to reduce storage requirements, computation, and memory 
traffic. For example, the look-ahead z-pyramid 195 could 
have a resolution of 256x256 samples. 

Preferably, even when a low-resolution look-ahead 
z-pyramid 195 is employed, the "ordinary" tiling algorithm 
is employed within procedure Process NxN Tile 1300, 
where control passes from step 1316 to step 1326, rather 
than step 1334 (step 1322 is skipped when tiling a look-
ahead frame). In other words, steps 1334 and 1336 are only
executed when tiling an ordinary frame with a low-
resolution z-pyramid, not when tiling a look-ahead frame with
a low-resolution z-pyramid.

Preferably, the look-ahead z-pyramid 195 is low-precision 
in addition to being low-resolution, in order to reduce 
storage requirements and memory traffic. For example, each 
z-value can be stored as a 12-bit value. Storage requirements 
can be further reduced by storing finest-level NxN tiles in 
the look-ahead z-pyramid 195 as mask-zfar pairs. 

Hierarchical Z-Buffering with Non-Conservative Culling

Even with the efficiency of hierarchical z-buffering, at 
some level of complexity it may not be possible to render a 
scene within the desired frame time. When this occurs, 
accuracy can be traded off for speed by culling objects that 
may be slightly visible, that is, by performing non- 
conservative occlusion culling. Although this can noticeably 
impair image quality, in some cases this is acceptable for 
faster frame generation. 

The speed versus accuracy tradeoff is controlled as fol- 
lows. The error limit is defined as the maximum number of 
tiling errors that are permitted within a finest-level tile of the
z-pyramid when tiling a particular polygon. A tiling error 
consists of failing to overwrite an image sample where a 
polygon is visible. 

Using an error limit E permits non-conservative culling to 
be performed with one modification to the basic algorithm 
for hierarchical tiling. When propagating depth values 
through the z-pyramid, at each finest-level tile, instead of 
propagating the farthest z-value to its parent tile, the z-value 
of rank E is propagated, where the farthest z-value has rank
0 and the nearest z-value has rank N²-1.

Thus, when E is 0 the farthest z is propagated, when E is
1 the next-to-the-farthest z is propagated, when E is 2 the
next-to-the-next-to-the-farthest z is propagated, and so forth.
When propagating at other levels of the pyramid (i.e., except 
when propagating from the finest level to the next-to-the- 
finest level), the farthest z value in the child tile is 
propagated, as in a traditional z-pyramid. Using this propa- 
gation procedure, except at the finest level, each z-value in 
the z-pyramid is the farthest rank-E z-value for any finest- 
level tile in the corresponding region of the screen. It follows 
that the occlusion test performed at step 1308 of procedure 
Process NxN Tile 1300 will automatically cull a polygon in 
any region of the screen where it is potentially visible at E 
or fewer image samples within any finest-level tile. 
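
A minimal sketch of rank-E propagation from a finest-level tile follows. It assumes that larger z-values are farther from the viewer, and the names rank_e_zfar and is_culled are hypothetical. The occlusion test shown is a simplified stand-in for the test at step 1308, comparing a polygon's nearest z against the propagated value.

```python
def rank_e_zfar(tile_z, error_limit):
    """Return the z-value of rank E in an NxN finest-level tile.

    Rank 0 is the farthest z-value; rank N*N - 1 is the nearest.
    """
    flat = sorted((z for row in tile_z for z in row), reverse=True)
    if not 0 <= error_limit < len(flat):
        raise ValueError("error limit must be in the range 0..N*N-1")
    return flat[error_limit]


def is_culled(polygon_nearest_z, propagated_z):
    """Treat the polygon as occluded if it lies beyond the stored value."""
    return polygon_nearest_z > propagated_z
```

For a 2x2 tile holding 0.9, 0.3, 0.3 and 0.3, an error limit of 0 propagates 0.9 and an error limit of 1 propagates 0.3. A polygon whose nearest z is 0.5 is therefore kept when E is 0 but culled when E is 1, even though it would be visible at the one sample whose stored depth is 0.9.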

This method avoids some of the subdivision required to 
definitively establish the visibility of polygons or portions of 
polygons that are potentially visible at only a small number 
of image samples, thereby reducing both memory traffic and 
computation. Moreover, this advantage is compounded 
when culling bounding boxes, since culling of a "slightly 
visible" box saves the work required to process all polygons 
inside it. 

Each polygon which is potentially visible at more than E
image samples within a finest-level tile is processed in the
usual way, so all of its visible image samples within these
tiles are written.

This method of non-conservative culling requires the
following modifications to procedure Process NxN Tile
1300, assuming an error limit of E.

First, instead of maintaining the farthest of the existing
z-values for a finest-level tile in variable zfar_x[F] (where F
is the index of the finest pyramid level), the z-value of rank
E among the existing z-values for that tile is maintained. For
example, if E is one, after looping over a finest-level tile in
procedure Process NxN Tile 1300, variable zfar_x[F] contains
the next-to-the-farthest z-value of the z-values that were
originally stored for that tile. This modification requires
changing procedure Update zfar_x 1600 when variable L is
the index of the finest level.

Second, instead of maintaining
the farthest z-value encountered so far for the tile being
processed in variable zfar_finest, the z-value of rank E
among those z-values is maintained in zfar_finest. For
example, if E is one, after looping over a finest-level tile in
procedure Process NxN Tile 1300, variable zfar_finest would
contain the next-to-the-farthest z-value in z-array[F], where
F is the index of the finest pyramid level.

Given these two modifications, procedure Propagate
Z-Values 1700 propagates the correct z-values through the
z-pyramid. One way of thinking of this method for non- 
conservative occlusion culling is that the error limit provides 
a convenient, predictable "quality knob" that controls the 
speed versus quality tradeoff. When the error limit is zero, 
the method performs standard hierarchical z-buffering and it 
produces a standard image that is free of visibility errors. 
Otherwise, the higher the error limit, the faster the frame rate 
but the poorer the image quality. 

When it is important to maintain a particular frame rate, 
the error limit can be adjusted accordingly, either by the user 
or by the rendering program, either at the beginning of a 
frame or during frame generation. 

The method can be applied whether the image is point 
sampled or oversampled, so the speed versus quality spec-
trum ranges from relatively fast generation of point-sampled
images with numerous visibility errors to relatively slow 
generation of accurately antialiased images that are free of 
visibility errors. 

One shortcoming of this method of non-conservative
culling is that it is possible that up to E image samples may
never be tiled within a finest-level tile, even though they are
covered by polygons that have been processed. This behav-
ior can be avoided by adding an additional propagation rule:
always propagate the farthest z-value until all image samples
within a finest-level tile have been covered.

Other simple modifications to propagation rules may also 
improve image quality. For example, to make errors less
noticeable, propagation rules could be structured to avoid
errors at adjacent image samples.

If multiple depth values are maintained corresponding to 
multiple error limits in the z-pyramid, different error limits 
can be selected depending on circumstances. For example, a 
higher error limit could be used when tiling bounding boxes 
than when tiling primitives, since culling a bounding box
can save a lot of work. This approach does not require any 
changes to the finest level of the z-pyramid, but it requires 
propagating and storing multiple z-values for each cell at the 
coarser levels of the z-pyramid. 

For example, if two z-values are maintained for each child
tile at cells in levels of the z-pyramid that are coarser than
the finest level, the farthest z-value and the next-to-the-
farthest z-value within the corresponding region of the
screen, then the farthest z-values could be applied to culling
primitives and the next-to-the-farthest z-values could be
applied to culling bounding boxes.

Summarizing the changes to the z-pyramid that are 
required when performing non-conservative culling for an 
error limit of E, the same information is stored at the finest 
level as with ordinary conservative culling, but at all coarser 
levels, instead of storing the farthest z-value within the 
corresponding region of the screen, the rank-E z-value for 
the corresponding region of the screen is stored. For 
example, if E is one, each z-value at levels that are coarser 
than the finest level is the next-to-the-farthest z-value for the 
corresponding region of the screen. 

To support culling with K different error limits, it is 
necessary to store K z-values for each z-pyramid cell at 
levels of the pyramid that are coarser than the finest level, 
each of these K z-values corresponding to one of the error 
limits. 
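
Storing K z-values per coarser-level cell might look like the sketch below, which keeps one value per error limit in a dictionary. The function names and the dictionary representation are assumptions made for illustration, and larger z is again taken to mean farther.

```python
def finest_to_parent(tile_z, error_limits):
    """For one finest-level tile, return {E: rank-E z-value} for each E."""
    flat = sorted((z for row in tile_z for z in row), reverse=True)
    return {e: flat[e] for e in error_limits}


def coarser_to_parent(child_cells, error_limits):
    """Above the finest level, propagate the farthest value per error limit."""
    return {e: max(cell[e] for cell in child_cells) for e in error_limits}
```

With error limits 0 and 1, the value stored under 0 could be used when culling primitives and the value stored under 1 when culling bounding boxes, as in the two-value example above.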

Implementation Issues. 

Although each of the stages in the graphics system 100 of 
FIG. 1 can be implemented in either software or hardware, 
at the present time, it is more practical to implement the 
scene manager 110 in software and to implement the culling 
stage 130 and the z-buffer renderer 140 in hardware. Soft-
ware implementation of the scene manager 110 is preferred
because of the relative complexity of the operations it 
performs and the flexibility that software implementation 
provides. 

Hardware implementation of the culling stage 130 and the 
z-buffer renderer 140 is preferred because, presently, it is not 
practical to attain real-time rendering of very complex 
scenes with software implementations. Although operations 
of the geometric processor 120 can be accelerated by hard- 
ware implementation, a software implementation running on 
the host processor (or another general-purpose processor) 
may provide adequate performance. 

As processor performance improves over time, imple- 
mentation of the entire system in software running on one or 
more general-purpose processors becomes increasingly
practical. 

Effectiveness of The Present Method of Occlusion Culling. 

The graphics system 100 of FIG. 1 was simulated to 
compare its efficiency to traditional z-buffer systems when 
processing densely occluded scenes. The simulation 
employed a building model which was constructed by 
replicating a polygonal model of an office cubicle. By 
varying the amount of replication, scenes were created with 
depth complexities ranging from 3 to 53. These scene 
models are poorly suited to culling using the "rooms and 
portals" method because of their relatively open geometry. 

A simulation program measured traffic on two classic 
bottlenecks in z-buffer systems: the traffic in polygons that 
need to be processed by the system, which will be referred 
to as geometry traffic, and depth-buffer memory traffic 
generated by depth comparisons, which will be called 
z-traffic and is measured in average number of bits of 
memory traffic per image sample. In the graphics system 100 
of FIG. 1, geometry traffic is the traffic in polygons and cube 
faces on connections 115 and 125 and z-traffic is the 
combined traffic on connections 165 and 175. 

Simulations compared z-buffering to hierarchical 
z-buffering, with and without box culling. The figures cited 
below assume that within the graphics system 100 the 
z-buffer 180 and output image 150 have resolution 1024 by 
1024 and the z-pyramid 170 has resolution 1024 by 1024 
and is organized in five levels of 4x4 tiles which are 
accessed on a tile-by-tile basis. This system is compared to
a conventional z-buffer system with a 1024 by 1024 z-buffer 
having 32-bit z-values which are accessed in 4x4 tiles. 

When processing versions of the scene having high depth 
complexity, the amount of geometry traffic was very high 
when box culling was not employed. For example, in a 
version of the scene with a depth complexity of 53, there 
were approximately 9.2 million polygons in the view 
frustum, and without box culling it was necessary to process 
all of these polygons every frame. With box culling and 
near-to-far traversal, it was only necessary to process 
approximately 45,000 polygons per frame, approximately a 
200-fold reduction. 

The advantage of hierarchical z-buffering over conven- 
tional z-buffering is that it reduces z-traffic dramatically, 
assuming favorable traversal order. For example, with box 
culling and near-to-far traversal of bounding boxes, for a
version of the scene with a depth complexity of 16, 
z-buffering generated approximately 10 times as much 
z-traffic as hierarchical z-buffering using a z-pyramid with 
8-bit z-values. 

When scene depth complexity was increased to 53, 
z-buffering generated approximately 70 times as much 
z-traffic as hierarchical z-buffering using a z-pyramid with 
8-bit z-values. For these scenes, performing box culling with 
conventional z-buffering was not effective at reducing 
z-traffic because boxes overlapped very deeply on the screen 
and the culling of occluded boxes generated a great deal of 
z-traffic. 

The relative advantage of hierarchical z-buffering was 
less when traversal order was less favorable, but even when 
scene geometry was traversed in random order, hierarchical 
z-buffering generated substantially less z-traffic than tradi- 
tional z-buffering. 

Even without box culling, hierarchical z-buffering 
reduced z-traffic substantially. For example, when a version 
of the scene having a depth complexity of 16 was rendered 
without box culling, z-buffering generated approximately 7 
times as much z-traffic as hierarchical z-buffering. 

Next, culling performance was measured when finest- 
level tiles in the z-pyramid were stored as mask-zfar pairs 
with 12-bit zfar values. Tiles at coarser levels of the 
z-pyramid were stored as arrays of 12-bit z-values. 

Compared to using a z-pyramid in which all tiles were 
stored in arrays of 8-bit z-values, this method improved 
culling efficiency, thereby reducing geometry traffic, and
reduced z-traffic by a factor of three or four. Overall, a 
z-pyramid in which finest-level tiles are represented as 
mask-zfar pairs and tiles at coarser levels are represented as 
arrays of low-precision z-values appears to produce the best 
performance. 

While the invention has been described with substantial 
particularity and has been shown with reference to preferred 
forms, or embodiments, it will be understood by those 
skilled in this art that other changes, than those mentioned, 
can be made. Therefore, it is understood that the scope of the 
invention is that defined by the appended claims. 
What is claimed is: 
1. Graphics apparatus comprising: 
a culling stage having an input for receiving a plurality of 
geometric objects for a scene, said culling stage testing 
said objects against a first depth buffer for occlusion 
and non-definitively but conservatively culling objects
from said plurality of geometric objects which it proves 
to be occluded in said scene; and 
a rendering stage downstream of said culling stage which, 
while said culling stage conservatively culls objects for 
a given frame of said scene, renders geometric objects
into said given frame of said scene which were tested
for occlusion in said culling stage but which were not
proven upstream of said rendering stage to be occluded,
wherein said culling stage, when occlusion testing
objects, tests at least one cell of each of the objects in
said plurality of objects against said first depth buffer.

2. Apparatus according to claim 1, wherein said culling
stage further updates said first depth buffer in response to
objects from said plurality of geometric objects.

3. Apparatus according to claim 2, wherein said rendering 
stage maintains a second depth buffer, 

further comprising means for updating said first depth 
buffer from said second depth buffer. 

4. Apparatus according to claim 1, wherein said rendering 
stage maintains a second depth buffer, 

further comprising means for updating said first depth 
buffer from said second depth buffer. 

5. Apparatus according to claim 1, wherein said first depth
buffer is hierarchical.

6. Apparatus according to claim 5, wherein said first depth 
buffer comprises a plurality of levels having progressively 
finer resolution, 

and wherein the resolution of the finest resolution level in 
said first depth buffer is coarser than the resolution at
which said rendering stage renders geometric objects. 

7. Apparatus according to claim 5, wherein said first depth 
buffer comprises a plurality of levels having progressively 
finer resolution, 

and wherein said rendering stage maintains a second
depth buffer having a resolution which is finer than that 
of the finest resolution level of said first depth buffer. 

8. Apparatus according to claim 7, wherein said second
depth buffer comprises a plurality of levels having progres-
sively finer resolution, and wherein the resolution of the
finest resolution level in said second depth buffer is finer
than that of the finest resolution level of said first depth
buffer.

9. Apparatus according to claim 1, wherein said rendering
stage maintains a second depth buffer, said second depth
buffer having a resolution which is finer than that of said first
depth buffer.

10. Apparatus according to claim 1, wherein said geo-
metric objects received by said culling stage have depth
values specified at an average precision which is greater than
the average precision at which said first depth buffer stores
depth values.

11. Apparatus according to claim 1, wherein said geomet-
ric objects received by said culling stage have depth values
specified at an average precision which is greater than the
average precision at which said culling stage maintains
depth values.

12. Apparatus according to claim 1, wherein said render- 
ing stage maintains a second depth buffer, 

and wherein said second depth buffer stores depth values
at an average precision which is greater than the 
average precision at which said first depth buffer stores 
depth values. 

13. Apparatus according to claim 1, wherein said render-
ing stage maintains a second depth buffer,

and wherein said culling stage maintains said first depth 
buffer with an average precision which is less than the 
average precision at which said rendering stage main- 
tains said second depth buffer. 

14. Apparatus according to claim 1, wherein said render-
ing stage maintains a second depth buffer different from said
first depth buffer.

15. Apparatus according to claim 1, wherein said culling
stage input receives said plurality of geometric objects in an
input stream, for use in rendering objects into a first tile of
an image raster, wherein said culling stage:

develops from said input stream a ZfarT value indicating
a depth beyond which all cells of subsequently received
objects are known to be occluded, at least to the extent
such cells are within said first tile; and

develops from said input stream a Mask which covers less
than all cells of said first tile, and a ZfarM value
indicating a depth not farther than ZfarT but beyond
which all cells of subsequently received objects are
known to be occluded if such cells are covered by said
Mask.

16. Apparatus according to claim 15, wherein said culling
stage culls from said input stream a given object in response
to determining that all cells on said given object which are
covered by said Mask have depth values farther than ZfarM,
and that all cells on said given object which are within said
first tile and which are not covered by said Mask have depth
values farther than ZfarT.

17. Apparatus according to claim 1, wherein said culling
stage is implemented as a dedicated unit.

18. Apparatus according to claim 1, further comprising a
scene manager for providing said plurality of geometric
objects, said scene manager including a depth buffer for
culling occluded geometric objects before providing said
plurality of geometric objects.

19. Apparatus according to claim 1, wherein said first
depth buffer is organized as a plurality of tiles, each tile
having a record indicating:

a coverage mask indicating a region of said tile; and

the farthest depth value of any visible sample encountered
on any geometric object within said region of said tile.

20. A graphics method, for use with an input stream of
geometric objects for a scene, comprising the steps of:

testing objects from said input stream against a first depth
buffer for occlusion and non-definitively but conservatively
culling objects from said input stream which can
be proven to be occluded in said scene; and

passing to a renderer downstream of said culling stage,
objects tested in said step of testing but not proven with
said first depth buffer to be occluded, said renderer
rendering objects passed to it for a particular frame
before said step of testing completes with respect to
other objects for the same frame,

wherein said step of testing objects from said input stream
against a first depth buffer for occlusion includes the
step of testing at least one cell of each of said objects
from said input stream against said first depth buffer.

21. A method according to claim 20, further comprising
the step of updating said first depth buffer in response to
objects from said plurality of geometric objects.

22. A method according to claim 21, further comprising
the steps of:

rendering said objects passed to said renderer using a
second depth buffer; and

updating said first depth buffer from said second depth
buffer.

23. A method according to claim 20, further comprising
the steps of:

rendering said objects passed to said renderer using a
second depth buffer; and

updating said first depth buffer from said second depth
buffer.

24. A method according to claim 20, wherein said first
depth buffer comprises a first hierarchical depth buffer.

25. A method according to claim 24, wherein said first
depth buffer comprises a plurality of levels having progressively
finer resolution,

further comprising the step of rendering said objects
passed to said renderer at a resolution which is finer
than the finest level of resolution in said first depth
buffer.

26. A method according to claim 24, wherein said first
depth buffer comprises a plurality of levels having progressively
finer resolution,

further comprising the step of rendering said objects
passed to said renderer with a second depth buffer
having a resolution which is finer than that of the finest
resolution level of said first depth buffer.

27. A method according to claim 20, wherein said renderer
maintains a second depth buffer, said second depth
buffer having a resolution which is finer than that of said first
depth buffer.

28. A method according to claim 20, wherein the geometric
objects in said input stream have depth values specified
with greater average precision than the average precision
at which said first depth buffer stores depth values.

29. A method according to claim 20, further comprising
the step of rendering said objects passed to said renderer
using a second depth buffer,

wherein said second depth buffer stores depth values at an
average precision which is greater than the average
precision at which said first depth buffer stores depth
values.

30. A method according to claim 20, further comprising
the steps of:

rendering said objects passed to said renderer using a
second depth buffer; and

updating said first depth buffer in response to objects from
said plurality of geometric objects, including the step of
storing depth values in said first depth buffer with an
average precision which is less than the average precision
at which depth values are stored in said second
depth buffer.

31. A method according to claim 20, further comprising
the step of rendering said objects passed to said renderer
using a second depth buffer different from said first depth
buffer.

32. A method according to claim 20, for use in rendering
objects into a first tile of an image raster, further comprising
the step of updating said first depth buffer in response to
objects from said plurality of geometric objects, including
the steps of:

developing from said input stream a ZfarT value indicating
a depth beyond which all cells of subsequently
received objects are known to be occluded, at least to
the extent such cells are within said first tile; and

developing from said input stream a Mask which covers
less than all cells of said first tile, and a ZfarM value
indicating a depth not farther than ZfarT but beyond
which all cells of subsequently received objects are
known to be occluded if such cells are covered by said
Mask.

33. A method according to claim 32, wherein said step of
updating further comprises the step of storing said Mask,
said ZfarT value and said ZfarM value in said first depth
buffer.

34. A method according to claim 32, wherein said step of
conservatively culling objects comprises the steps of:

determining for a particular object in said input stream
that all cells on said particular object which are covered
by said Mask have depth values farther than ZfarM, and
that all cells on said particular object which are within
said first tile and which are not covered by said Mask
have depth values farther than ZfarT; and

culling said particular object in response to said step of 
determining. 

35. A method according to claim 20, comprising the step 
of providing said input stream only once for said scene. 

36. A method according to claim 35, further comprising
the step of pre-culling objects from said input stream, 
upstream of said step of providing. 

37. A method according to claim 20, wherein said step of 
developing a first depth buffer further includes the step of 
maintaining said first depth buffer with a plurality of tiles, 
each tile having a record comprising: 

a coverage mask indicating a region of said tile; and 
the farthest depth value of any visible sample encountered 
on any geometric object within said region of said tile. 
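
As a rough illustration of the per-tile record described in claims 19 and 37, the sketch below stores a coverage mask and a farthest-depth value for each tile. The class name `TileRecord`, the bitmask encoding, the far-plane default of 1.0, and the helper `make_depth_buffer` are hypothetical choices made for this example only.

```python
from dataclasses import dataclass


@dataclass
class TileRecord:
    """One record of a tiled depth buffer (hypothetical layout)."""
    mask: int    # coverage mask: bit i set means cell i belongs to the region
    zfar: float  # farthest depth of any visible sample seen in that region


def make_depth_buffer(tiles_x, tiles_y, far_plane=1.0):
    """Create a tiles_y x tiles_x grid of records with empty regions.

    Assumption: before anything is drawn, the farthest depth is taken
    to be the far clipping plane.
    """
    return [
        [TileRecord(mask=0, zfar=far_plane) for _ in range(tiles_x)]
        for _ in range(tiles_y)
    ]
```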

38. A method for use in conservatively culling objects 
from an input stream of objects for a first set of at least two 
image cells, comprising the steps of: 

developing from said input stream a ZfarT value indicating
a depth beyond which all cells of subsequently
received objects are known to be occluded, at least to 
the extent such cells are within said first set of cells; 

developing from said input stream a Mask which covers 
less than all cells of said first set of cells, and a ZfarM
value indicating a depth not farther than ZfarT but
beyond which all cells of subsequently received
objects are known to be occluded if covered by said
Mask. 

39. A method according to claim 38, for use with an image 
raster divided into a plurality of tiles each covering a
respective rectangular region of said image cells, wherein 
said first set of image cells consists of one of said tiles. 

40. A method according to claim 38, further comprising 
the step of culling from said input stream a given object in 
response to determining that all cells on said given object 
which are covered by said Mask have depth values farther 
than ZfarM, and that all cells on said given object which are
within said set of image cells and which are not covered by
said Mask have depth values farther than ZfarT.

41. A method according to claim 38, further comprising 
the step of culling a given object from said input stream in 
response to determining that all cells on said given object 
which are within said set of image cells have depth values 
farther than ZfarT.

42. A method according to claim 38, further comprising 
the step of rendering a given object from said input stream 
in response to determining that not all cells on said given 
object which are covered by said Mask have depth values 
farther than ZfarM.

43. A method according to claim 38, further comprising 
the step of rendering a given object from said input stream 
in response to determining that not all cells on said given 
object which are within said set of image cells and which are 
not covered by said Mask have depth values farther than 
ZfarT.

44. A method according to claim 38, further comprising 
the step of culling from said input stream all given objects 
in response to determining that all cells on the given object 
which are covered by said Mask have depth values farther 
than ZfarM, and that all cells on the given object which are
within said set of image cells and which are not covered by
said Mask have depth values farther than ZfarT.
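
The culling condition recited in claims 40 to 44 can be sketched as a per-cell comparison: cells covered by the Mask are tested against ZfarM and all other cells of the set against ZfarT. The function name `is_occluded`, the bitmask encoding of the Mask, the dictionary of per-cell depths, and the larger-is-farther depth convention are assumptions of this sketch rather than details given in the claims.

```python
def is_occluded(cells, mask, zfar_t, zfar_m):
    """Return True only if every cell of the object is provably hidden.

    cells:  dict mapping a cell index within the set to the object's depth there.
    mask:   integer bitmask; bit i set means cell i is covered by the Mask.
    Assumes larger depth values are farther from the viewer.
    """
    for cell, depth in cells.items():
        covered = (mask >> cell) & 1
        limit = zfar_m if covered else zfar_t
        if depth <= limit:
            # This cell is not farther than the applicable bound,
            # so occlusion cannot be proven and the object is kept.
            return False
    return True
```

An object for which the function returns False would be passed on for rendering, which matches the conservative behaviour of claims 42 and 43.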

45. A method according to claim 38, wherein said steps of
developing collectively comprise the steps of:

receiving a current object from said input stream, wherein
the depth of the farthest cell of said current object
within said first set of image cells is between ZfarT and
ZfarM immediately prior to processing of said current
object, and wherein said current object covers all cells 
in said first set of image cells not covered by said Mask 
immediately prior to processing of said current object; 
and 

in response to said current object, updating said ZfarT
value to a value which is derived from and at least as 
far as the depth value of said farthest cell of said current 
object. 

46. A method according to claim 45, wherein said steps of 
developing collectively further comprise the steps of: 

leaving said Mask unchanged in response to said current 
object; and 

leaving said ZfarM value unchanged in response to said
current object.

47. A method according to claim 38, wherein said steps of
developing collectively comprise the steps of: 

receiving a current object from said input stream, wherein 
the depth of the farthest cell of said current object 
within said first set of image cells is between ZfarT and
ZfarM immediately prior to processing of said current
object, and wherein the union of the cells covered by 
said current object and the cells covered by said Mask 
immediately prior to processing of said current object 
does not completely cover said first set of image cells; 
and 

in response to said current object, updating said Mask to 
indicate said union of the cells covered by said current 
object and the cells covered by said Mask immediately 
prior to processing of said current object. 

48. A method according to claim 47, wherein said steps of
developing collectively further comprise the step of, in
response to said current object, updating said ZfarM value to
a value which is derived from and at least as far as the depth 
value of said farthest cell of said current object. 

49. A method according to claim 38, wherein said steps of
developing collectively further comprise the step of leaving
said ZfarT value unchanged in response to said current
object. 

50. A method according to claim 38, wherein said steps of
developing collectively comprise the steps of:

receiving a current object from said input stream, wherein
the depth of the farthest cell of said current object
within said first set of image cells is nearer than ZfarM
immediately prior to processing of said current object,
and wherein said current object covers said first set of
image cells; and

in response to said current object, updating said ZfarT
value to a value which is derived from and at least as 
far as the depth value of said farthest cell of said current 
object. 

51. A method according to claim 50, wherein said steps of
developing collectively further comprise the step of clearing 
said Mask in response to said current object. 

52. A method according to claim 38, wherein said steps of
developing collectively comprise the steps of:

receiving a current object from said input stream, wherein
the depth of the farthest cell of said current object
within said first set of image cells is nearer than ZfarM
immediately prior to processing of said current object, 
wherein said current object does not cover said first set 
of image cells but does cover all cells in said first set of
image cells not covered by said Mask immediately
prior to processing of said current object; and

in response to said current object, updating said Mask to
indicate only the cells covered by said current object.

53. A method according to claim 52, wherein said steps of
developing collectively further comprise the step of, in
response to said current object, updating said ZfarM value to
a value which is derived from and at least as far as the depth
value of said farthest cell of said current object.

54. A method according to claim 53, wherein said steps of
developing collectively further comprise the step of, in
response to said current object, updating said ZfarT value to
a value which is derived from and at least as far as the value
of ZfarM immediately prior to processing of said current
object.

55. A method according to claim 38, wherein the depth of
the farthest cell of said current object within said first set of
image cells is nearer than ZfarM immediately prior to
processing of said current object, wherein said current object
does not cover said first set of image cells but does cover all
cells in said first set of image cells not covered by said Mask
immediately prior to processing of said current object,

wherein said step of maintaining a ZfarT value comprises
the step of, in response to said current object, updating
said ZfarT value to a value which is derived from and
at least as far as the value of ZfarM immediately prior
to processing of said current object.

56. A method according to claim 38, wherein said steps of
developing collectively comprise the steps of:

receiving a current object from said input stream, wherein
the depth of the farthest cell of said current object
within said first set of image cells is nearer than ZfarM
immediately prior to processing of said current object,
wherein the union of the cells covered by said current
object and the cells covered by said Mask immediately
prior to processing of said current object does not
completely cover said first set of image cells; and

in response to said current object, updating said Mask to
indicate said union of the cells covered by said current
object and the cells covered by said Mask immediately
prior to processing of said current object.

57. A method according to claim 56, wherein said steps of
developing collectively further comprise the step of leaving
said ZfarM value unchanged in response to said current
object.

58. A method according to claim 56, wherein said steps of
developing collectively further comprise the step of leaving
said ZfarT value unchanged in response to said current
object.

59. A method for use in conservatively culling objects
from an input stream of objects for a first set of at least two
image cells, comprising the step of maintaining a ZfarT
value, a ZfarM value and a Mask for said first set of image
cells in response to objects in said input stream of objects,

said method further comprising the step of, in response to
a current one of said input objects in which the depth
of the farthest cell of said current object within said first
set of image cells is between the values of ZfarT and
ZfarM immediately prior to processing of said current
object, and wherein said current object covers all cells
in said first set of image cells not covered by said Mask
immediately prior to processing of said current object,

updating said ZfarT value to a value which is derived from
and at least as far as the depth value of said farthest cell
of said current object.

60. A method according to claim 59, further comprising
the steps of:

leaving said Mask unchanged in response to said current
object; and

leaving said ZfarM value unchanged in response to said
current object.

61. A method according to claim 59, wherein the objects
in said stream of objects each cover only cells within said
first set of image cells.

62. A method according to claim 59, wherein said step of
updating said ZfarT value to a value which is derived from
and at least as far as the depth value of said farthest cell of
said current object comprises the step of rounding the depth
value of said farthest cell of said current object to a depth
value which is at least as far as said farthest cell of said
current object.

63. A method according to claim 59, further comprising
the step of rendering at least those cells of said current object
which, if covered by said Mask prior to processing said
current object, have depth values nearer than ZfarM, and if
within said set of image cells but not covered by said Mask
prior to processing said current object, have depth values
nearer than ZfarT.

64. A method according to claim 63, wherein said step of
rendering comprises the step of rendering at least all those
cells of said current object which have depth values nearer
than ZfarT.

65. A method according to claim 63, wherein said step of
rendering comprises the step of rendering at least all those
cells of said current object which are within said set of image
cells.

66. A method according to claim 63, wherein said step of
rendering comprises the step of rendering all cells of said
current object.

67. A method for use in conservatively culling objects
from an input stream of objects for a first set of at least two
image cells, comprising the step of maintaining a ZfarT
value, a ZfarM value and a Mask for said first set of image
cells in response to objects in said input stream of objects,

said method further comprising the step of, in response to
a current one of said input objects in which the depth
of the farthest cell of said current object within said first
set of image cells is between the values of ZfarT and
ZfarM immediately prior to processing of said current
object, and wherein the union of the cells covered by
said current object and the cells covered by said Mask
immediately prior to processing of said current object
does not completely cover said first set of image cells,

updating said Mask to indicate said union of the cells
covered by said current object and the cells covered by
said Mask immediately prior to processing of said
current object.

68. A method according to claim 67, further comprising
the step of, in response to said current object, updating said
ZfarM value to a value which is derived from and at least as
far as the depth value of said farthest cell of said current
object.

69. A method according to claim 68, wherein said step of
updating said ZfarM value to a value which is derived from
and at least as far as the depth value of said farthest cell of
said current object comprises the step of rounding the depth
value of said farthest cell of said current object to a depth
value which is at least as far as said farthest cell of said
current object.

70. A method according to claim 68, further comprising
the step of leaving said ZfarT value unchanged in response
to said current object.




71. A method according to claim 67, for use with an image 
raster divided into a plurality of tiles each covering a 
respective rectangular region of image cells, 

wherein said first set of image cells consists of one of said 
tiles. 

72. A method according to claim 67, further comprising 
the step of rendering at least those cells of said current object 
which are within said set of image cells but not covered by 
said Mask prior to processing said current object, have depth 
values nearer than ZfarT.

73. A method for use in conservatively culling objects 
from an input stream of objects for a first set of at least two 
image cells, comprising the step of maintaining a ZfarT
value, a ZfarM value and a Mask for said first set of image
cells in response to objects in said input stream of objects,

said method further comprising the step of, in response to
a current one of said input objects in which the depth
of the farthest cell of said current object within said first
set of image cells is nearer than the value of ZfarM
immediately prior to processing of said current object,
and wherein said current object covers said first set of
image cells,

updating said ZfarT value to a value which is derived from
and at least as far as the depth value of said farthest cell 
of said current object. 

74. A method according to claim 73, further comprising 
the step of clearing said Mask in response to said current 
object. 

75. A method according to claim 73, wherein said step of 
updating said ZfarT value to a value which is derived from
and at least as far as the depth value of said farthest cell of 
said current object comprises the step of rounding the depth 
value of said farthest cell of said current object to a depth 
value which is at least as far as said farthest cell of said 
current object. 

76. A method according to claim 73, further comprising 
the step of rendering at least those cells of said current object 
which are within said set of image cells.

77. A method for use in conservatively culling objects 
from an input stream of objects for a first set of at least two 
image cells, comprising the step of maintaining a ZfarT
value, a ZfarM value and a Mask for said first set of image
cells in response to objects in said input stream of objects,

said method further comprising the step of, in response to
a current one of said input objects in which the depth
of the farthest cell of said current object within said first
set of image cells is nearer than the value of ZfarM
immediately prior to processing of said current object, 
wherein said current object does not cover said first set 
of image cells but does cover all cells in said first set of 
image cells not covered by said Mask immediately 
prior to processing of said current object, 

updating said Mask to indicate only the cells in said first 
set of cells which are covered by said current object. 

78. A method according to claim 77, further comprising 
the step of, in response to said current object, updating said
ZfarM value to a value which is derived from and at least as
far as the depth value of said farthest cell of said current 
object. 

79. A method according to claim 78, wherein said step of 
updating said ZfarM value to a value which is derived from
and at least as far as the depth value of said farthest cell of 
said current object comprises the step of rounding the depth 
value of said farthest cell of said current object to a depth 
value which is at least as far as said farthest cell of said 
current object. 

80. A method according to claim 78, further comprising
the step of, in response to said current object, updating said
ZfarT value to a value which is derived from and at least as
far as the value of ZfarM immediately prior to processing of
said current object.

81. A method according to claim 77, further comprising 
the step of rendering at least those cells of said current object 
which are within said set of image cells. 

82. A method for use in conservatively culling objects
from an input stream of objects for a first set of at least two
image cells, comprising the step of maintaining a ZfarT
value, a ZfarM value and a Mask for said first set of image
cells in response to objects in said input stream of objects,

said method further comprising the step of, in response to
a current one of said input objects in which the depth
of the farthest cell of said current object within said first
set of image cells is nearer than the value of ZfarM
immediately prior to processing of said current object,
wherein said current object does not cover said first set
of image cells but does cover all cells in said first set of
image cells not covered by said Mask immediately
prior to processing of said current object,

updating said ZfarT value to a value which is derived from
and at least as far as the value of ZfarM immediately
prior to processing of said current object.

83. A method according to claim 82, further comprising 
the step of rendering at least those cells of said current object 
which are within said set of image cells. 

84. A method for use in conservatively culling objects
from an input stream of objects for a first set of at least two
image cells, comprising the step of maintaining a ZfarT
value, a ZfarM value and a Mask for said first set of image
cells in response to objects in said input stream of objects,

said method further comprising the step of, in response to
a current one of said input objects in which the depth
of the farthest cell of said current object within said first
set of image cells is nearer than the value of ZfarM
immediately prior to processing of said current object,
wherein the union of the cells covered by said current
object and the cells covered by said Mask immediately
prior to processing of said current object does not
completely cover said first set of image cells,

updating said Mask to indicate said union of the cells
covered by said current object and the cells covered by
said Mask immediately prior to processing of said
current object.

85. A method according to claim 84, further comprising 
the step of leaving said ZfarM value unchanged in response
to said current object.

86. A method according to claim 84, further comprising 
the step of leaving said ZfarT value unchanged in response
to said current object. 

87. A method according to claim 84, further comprising 
the step of rendering at least those cells of said current object
which are within said set of image cells.
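
The update cases recited separately in claims 59 to 87 can be gathered into one routine, sketched below for a single set of image cells. This is an illustrative reading, not the patented implementation: the names `Tile` and `update`, the bitmask encoding, the larger-is-farther convention, the choice to reset ZfarM to the nearest depth when the Mask is cleared, and the decision to ignore objects at or beyond ZfarT are all assumptions of the sketch. The object is assumed to have already survived the occlusion test.

```python
from dataclasses import dataclass

NEAREST = 0.0  # assumption: depth 0.0 is the near plane, larger is farther


@dataclass
class Tile:
    full: int                 # bitmask with one bit set per cell of the set
    zfar_t: float             # ZfarT: bound valid for every cell of the set
    zfar_m: float = NEAREST   # ZfarM: bound valid for cells under the Mask
    mask: int = 0             # Mask: cells to which ZfarM applies


def update(tile, cover, zmax):
    """Fold one visible object into the tile record.

    cover: bitmask of the cells the object covers within the set.
    zmax:  depth of the object's farthest cell within the set.
    """
    if zmax >= tile.zfar_t:
        return  # assumption: such an object cannot tighten any bound
    uncovered = tile.full & ~tile.mask
    fills_rest = (cover & uncovered) == uncovered
    if zmax < tile.zfar_m:
        if cover == tile.full:
            # Claims 73-74: object covers the whole set; clear the Mask.
            tile.zfar_t, tile.zfar_m, tile.mask = zmax, NEAREST, 0
        elif fills_rest:
            # Claims 77-80 and 82: old ZfarM becomes ZfarT, object becomes Mask.
            tile.zfar_t, tile.zfar_m, tile.mask = tile.zfar_m, zmax, cover
        else:
            # Claims 84-86: grow the Mask, leave both depth bounds alone.
            tile.mask |= cover
    elif fills_rest:
        # Claims 59-60: object completes coverage; only ZfarT tightens.
        tile.zfar_t = zmax
    else:
        # Claims 67-70: grow the Mask and push ZfarM out to the object.
        tile.mask |= cover
        tile.zfar_m = zmax
```

The rounding of claims 62, 69, 75 and 79 is omitted here; in a lower-precision buffer each stored value would additionally be rounded toward the far side.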

88. A method for conservatively culling objects from an 
input stream of objects for a first set of at least two image 
cells, comprising the step of processing said objects in
sequence, said step of processing including, for each current
one of said objects, the steps of:

(a) maintaining a ZfarT value (i) which, if the portion of
the far clipping plane that covers the first set of image
cells has not yet been proven occluded, is derived from
and at least as far as the depth of the far clipping plane,
and (ii) which, if the portion of the far clipping plane 
that is in the first set of image cells has already been 
proven occluded, is derived from and at least as far as
the depth value of the farthest cell of all the objects in 
said stream of objects which have been processed up to 
and including said current object, which cell is within 
said first set of image cells and which cell has not been 
proven occluded; 

(b) maintaining a Mask which, when neither empty nor 
full, identifies the union of coverage of said first set of 
image cells by cells of at least one of the objects in said 
stream of objects, which cells have not been proven 
occluded, and which cells are nearer than ZfarT; and

(c) maintaining a ZfarM value which is derived from and
at least as far as the farthest depth value of all cells 
represented in the Mask. 

89. A method according to claim 88, further comprising 
the step of processing a further object for said first set of 
image cells, said further object being additional to said input 
stream of objects. 

90. A method according to claim 88, for use with an image 
raster divided into a plurality of subsets of cells including 
said first set of image cells, each subset having more than 
one cell, the objects in said stream of objects each covering 
only cells within a respective single one of said subsets of 
cells, 

further comprising the step of, prior to said step of 
processing a given one of said objects, dividing a 
predecessor object into a plurality of objects each 
covering only cells within a respective one of said 
subsets of cells. 
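
The dividing step of claim 90 can be illustrated by splitting an object's covered cells according to the subset (here, a square tile) that each cell falls in. The function name `split_by_tile`, the square tile shape, and the representation of an object as a dictionary from (x, y) cell coordinates to depth are assumptions made for this sketch.

```python
from collections import defaultdict


def split_by_tile(cells, tile_size):
    """Divide one object into pieces that each lie within a single tile.

    cells: dict mapping (x, y) cell coordinates to the object's depth there.
    Returns a dict mapping (tile_x, tile_y) to the sub-dict of cells in that tile.
    """
    pieces = defaultdict(dict)
    for (x, y), depth in cells.items():
        # Integer division assigns each cell to exactly one tile.
        pieces[(x // tile_size, y // tile_size)][(x, y)] = depth
    return dict(pieces)
```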

91. A method according to claim 88, wherein said step of 
maintaining a ZfarT value, if the portion of the far clipping
plane that is in the first set of image cells has already been 
proven occluded, comprises the step of rounding the depth 
value of the farthest cell which is within said first set of 
image cells and which has not been proven occluded, from 
all the objects in said stream of objects which have been 
processed up to and including said current object, to a depth 
value which is at least as far as said farthest cell. 

92. A method according to claim 88, wherein said step of 
maintaining a ZfarM value comprises the step of rounding
the farthest depth value of all cells represented in the Mask
to a depth value which is at least as far as said farthest depth
value. 
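
The rounding recited in claims 91 and 92 keeps the stored bound conservative: a depth is quantized to fewer bits but never to a value nearer than the original. A minimal sketch follows, assuming depths normalized to the range 0 to 1 with larger values farther away; the function name `round_far` and the `bits` parameter are hypothetical.

```python
import math


def round_far(depth, bits):
    """Quantize a depth in [0, 1] to `bits` bits, rounding toward the far side.

    The result is always at least as far as `depth`, so a test against the
    rounded value can fail to cull an object but never culls a visible one.
    """
    levels = (1 << bits) - 1
    q = math.ceil(depth * levels)
    if q / levels < depth:
        q += 1  # guard against floating-point round-off in the product
    return min(q, levels) / levels
```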

93. A method according to claim 88, wherein said step of 
maintaining a Mask comprises the step of updating said Mask
only in response to objects which are entirely nearer than
ZfarT.

94. A method according to claim 88, wherein said step of 
maintaining a Mask comprises the step of maintaining said
Mask such that when it is empty, said Mask indicates that
none of the objects in said stream of objects which have been
processed up to and including said current object, are nearer
than ZfarT.

95. A method according to claim 88, further comprising 
the step of, in response to a given one of said objects in said 
input stream of objects and prior to said steps of 
maintaining, 
conservatively testing from said given object each cell
which is within the given set of image cells, for 
occlusion by previous objects from said input stream of 
objects. 

96. A method according to claim 95, further comprising 
the step of passing said given object to a renderer in response
to said step of conservatively testing failing to prove said 
given object occluded. 

97. A method according to claim 88, further comprising 
the step of culling a given object from said stream of objects 
in response to determining that all cells on said given object 
which are covered by said Mask have depth values farther 
than Zfar^, and that all cells on said given object which are 
within said set of image cells and which are not covered by 
said Mask have depth values farther than Zfar r . 

98. A method according to claim 88, further comprising 
the step of rendering a given object from said stream of 
objects in response to determining that not all cells on said 
given object which are covered by said Mask have depth 
values farther than Zfar^. 

99. A method according to claim 88, further comprising 
the step of rendering a given object from said stream of 
objects in response to determining that not all cells on said 
given object which are within said set of image cells and 
which are not covered by said Mask have depth values 
farther than Zfar r . 

100. A method according to claim 88, further comprising 
the step of culling a given object from said stream of objects 
in response to determining that all cells on said given object 
which are within said set of image cells have depth values 
farther than Zfar r . 
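
The cull-or-render decisions of claims 97 through 100 can be sketched as a single predicate. This is a minimal illustration, not the claimed implementation: it assumes larger depth values are farther, and it uses the hypothetical names `zfar_mask` for the Zfar value compared against cells covered by the Mask and `zfar_rest` for the Zfar value compared against the remaining cells in the set of image cells. The `Cell` type and the function names are likewise assumptions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Cell:
    depth: int     # depth of the object at this cell (larger = farther, assumed)
    in_mask: bool  # True if this cell is covered by the Mask


def proven_occluded(cells: Iterable[Cell], zfar_mask: int, zfar_rest: int) -> bool:
    """Return True only if every cell of the object is beyond its bound.

    Cells covered by the Mask are compared against zfar_mask; all other
    cells are compared against zfar_rest (the condition of claim 97).
    """
    cells = list(cells)
    masked_far = all(c.depth > zfar_mask for c in cells if c.in_mask)
    rest_far = all(c.depth > zfar_rest for c in cells if not c.in_mask)
    return masked_far and rest_far


def decide(cells: Iterable[Cell], zfar_mask: int, zfar_rest: int) -> str:
    """Cull when occlusion is proven; otherwise render (claims 98 and 99)."""
    return "cull" if proven_occluded(cells, zfar_mask, zfar_rest) else "render"
```

If either group contains a cell that is not farther than its bound, occlusion is not proven and the object is rendered, which matches the two rendering conditions of claims 98 and 99.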

101. Graphics apparatus comprising: 

a dedicated culling stage having an input for receiving a 
plurality of geometric objects for a scene, said culling 
stage non-definitively but conservatively culling 
objects from said plurality of geometric objects which 
it proves with a first depth buffer to be occluded in said 
scene; and 

a rendering stage downstream of said culling stage, said 
rendering stage rendering geometric objects not proven 
upstream of said rendering stage to be occluded, 

wherein said culling stage, in culling objects, tests at least 
one cell of each of the objects in said plurality of 
objects against said first depth buffer. 

102. A graphics method, for use with an input stream of 
geometric objects for a scene, comprising the steps of: 

non-definitively but conservatively culling, in a dedicated 
culling stage, objects from said input stream which can 
be proven with a first depth buffer to be occluded in 
said scene; and 

passing to a renderer downstream of said culling stage, 
objects from said input stream not proven with said first 
depth buffer to be occluded, 

wherein said step of culling includes the step of testing at 
least one cell of each of said objects from said input 
stream against said first depth buffer. 
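
The two-stage arrangement of claims 101 and 102 can be sketched as a generator that filters an object stream before it reaches a renderer. The sketch rests on several assumptions not stated in the claims: each object is a hypothetical dict mapping a cell index to a single depth at that cell, each listed cell is fully covered by the object, larger depths are farther, and the culling stage updates its own depth buffer from the objects it passes on. The name `culling_stage` is illustrative only.

```python
import math
from typing import Dict, Iterable, Iterator


def culling_stage(objects: Iterable[Dict[int, float]],
                  num_cells: int) -> Iterator[Dict[int, float]]:
    """Yield only the objects that are not proven occluded.

    cull_depth is the culling stage's own ("first") depth buffer; it
    holds the nearest depth seen so far at each cell.
    """
    cull_depth = [math.inf] * num_cells
    for obj in objects:
        # Proven occluded only if every covered cell is strictly farther
        # than what the culling depth buffer already holds there.
        if all(depth > cull_depth[cell] for cell, depth in obj.items()):
            continue  # culled: never reaches the renderer
        for cell, depth in obj.items():
            cull_depth[cell] = min(cull_depth[cell], depth)
        yield obj  # not proven occluded: pass downstream to the renderer
```

The test is conservative in the sense used by the claims: an object at equal depth, or one the buffer cannot rule out, is passed downstream, and final visibility is left to the renderer.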